Description

The present invention relates to isolated human serine protease (PSP1) polynucleotides, their homologs and isoforms
and polymorphic variants and their detection; to essentially pure PSP1 proteins; and to compositions and methods
of producing and using PSP1 polynucleotides and proteins.

Mutations in the presenilins (PS-1 and PS-2) account for ~95% (75% and 20%, respectively) of all cases of early
onset familial Alzheimer's disease (FAD). See R. Sherrington et al., Nature 375, 754-760 (1995); E.I. Rogaev et al.,
Nature 376, 775-778 (1995); and E. Levy-Lahad et al., Science 269, 973-977 (1995). The presenilins are highly
homologous (67% identical), multi-membrane spanning proteins whose function is unknown.

It has been demonstrated that the 46 kDa full-length PS-1 protein is normally processed to 28 kDa and 18 kDa
fragments; PS-2 has been reported to be similarly cleaved. See M. Mercken et al., FEBS Letters 389, 297-303 (1996).
The predicted cleavage site(s) to account for fragments of this size would be in a region of the protein coded for by
exon 8 and exon 9. Exon 8 is a hot spot for mutations leading to FAD. Thus, this region of PS-1, and potentially the
cleavage of PS-1 in this region by a presenilinase protease, are important events in the functionality of the protein. A
region of PS-1 spanning exons 8-11 has been demonstrated in the present invention to specifically bind a protease,
PSP1, whose activity against its endogenous substrates and/or ability to bind to PS-1 are important in the pathology
of neurodegeneration associated with AD, frontal lobe dementia, cortical Lewy body disease, dementia of Parkinson's
disease, acute and chronic phases of degeneration following stroke or head injury, neuronal degeneration found in
motor neurone disease, AIDS dementia and chronic epilepsy. Thus, a need exists for provision of the nucleotide and
amino acid sequences corresponding to PSP1, for modulators of PSP1 binding to PS-1, and/or modulators of PSP1's
proteolytic activity, for methods to identify such modulators and for reagents useful in such methods.

Accordingly, one aspect of the present invention is an isolated polynucleotide encoding a biologically active PSP1 
polypeptide. 

Another aspect of the invention is an isolated polynucleotide selected from the group consisting of: 

(a) a polynucleotide encoding PSP1-1 having the nucleotide sequence as set forth in SEQ ID NO: 24 from
nucleotide 603 to 1979; and

(b) a polynucleotide substantially similar to SEQ ID NO: 24. 

Another aspect of the invention is an isolated polynucleotide selected from the group consisting of:

(a) a polynucleotide encoding PSP1-2 having the nucleotide sequence as set forth in SEQ ID NO: 23 from
nucleotide 603 to 1979; and

(b) a polynucleotide substantially similar to SEQ ID NO: 23. 

Another aspect of the invention is an isolated polynucleotide selected from the group consisting of:

(a) a polynucleotide encoding PSP1-3 having the nucleotide sequence as set forth in SEQ ID NO: 26 from
nucleotide 603 to 1736; and

(b) a polynucleotide substantially similar to SEQ ID NO: 26.

Another aspect of the invention is an isolated polynucleotide selected from the group consisting of: 

(a) a polynucleotide encoding PSP1-4 having the nucleotide sequence as set forth in SEQ ID NO: 28 from
nucleotide 603 to 1913; and

(b) a polynucleotide substantially similar to SEQ ID NO: 28. 

In a further aspect the invention provides any isolated polynucleotide as above defined wherein nucleotides 672
and 1435 are independently selected from C and T, hereinafter referred to as 'polymorphic variants'.

Another aspect of the invention is the functional polypeptides encoded by the polynucleotides of the invention.

Another aspect of the invention is an antisense oligonucleotide comprising a sequence which is capable of binding 
to the polynucleotides of the invention or D87258. 

Another aspect of the invention is modulators of the polypeptides of the invention or of D87258.

Another aspect of the invention is a method for assaying a medium for the presence of a substance that modulates
PSP1 or D87258 activity by affecting the binding of PSP1 or D87258 to cellular binding partners comprising the steps of:

(a) providing a PSP1 or D87258 protein having the amino acid sequence of PSP1-1, PSP1-2, PSP1-3 or PSP1-4
or D87258, or a functional derivative or polymorphic variant thereof and a cellular binding partner or synthetic
analog thereof;

(b) incubating with a test substance which is suspected of modulating PSP1 or D87258 activity under conditions 
which permit the formation of a PSP1 or D87258 protein/cellular binding partner complex; 

(c) assaying for the presence of the complex, free PSP1 or D87258 protein or free cellular binding partner; and

(d) comparing to a control to determine the effect of the substance.

Another aspect of the invention is a method for assaying a medium for the presence of a substance that modulates 
PSP1 or D87258 activity by inhibiting proteolytic activity on a cellular substrate comprising the steps of: 

(a) providing a PSP1 or D87258 protein having the amino acid sequence of PSP1-1, PSP1-2, PSP1-3 or PSP1-4
or D87258, or a functional fragment or polymorphic variant thereof and a cellular substrate or synthetic analog
thereof;

(b) incubating with a test substance which is suspected of inhibiting PSP1 or D87258 activity under conditions 
which permit the formation of a PSP1 enzyme/substrate complex and subsequent cleavage of the substrate;

(c) assaying for the presence of proteolytically cleaved substrate; and

(d) comparing to a control to determine the effect of the substance. 

Another aspect of the invention is a method for assaying for the presence of a substance that modulates PSP1 or 
D87258 activity by direct binding to PSP1 or D87258 protein comprising the steps of: 

(a) providing a labelled PSP1 or D87258 protein having the amino acid sequence of PSP1-1, PSP1-2, PSP1-3 or
PSP1-4 or D87258 or a functional derivative or polymorphic variant thereof;

(b) providing solid support-associated modulator candidates; 

(c) incubating a mixture of the labelled PSP1 or D87258 protein with the support-associated modulator candidates
under conditions which can permit the formation of a PSP1 protein/modulator candidate complex;

(d) separating the solid support from free soluble labelled PSP1 or D87258 protein; 

(e) assaying for the presence of solid support-associated labelled protein;

(f) isolating the solid support complexed with labelled PSP1 or D87258 protein; and

(g) identifying the modulator candidate. 

Another aspect of the invention is PSP1 or D87258 protein modulating compounds identified by the methods of
the invention.

Another aspect of the invention is a method for the treatment of a patient having need to modulate PSP1 or D87258
activity comprising administering to the patient a therapeutically effective amount of the modulating compounds of the
invention.

Another aspect of the invention is a method of diagnosing conditions associated with PSP1 or D87258 protein 
deficiency which comprises: 

(a) isolating a polynucleotide sample from an individual;

(b) assaying the polynucleotide sample and a polynucleotide of the invention encoding PSP1 or D87258; and

(c) comparing differences between the polynucleotide sample and the PSP1 or D87258 polynucleotide, wherein
any differences indicate mutations in the PSP1 or D87258 sequence.
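
The comparison in step (c) can be illustrated with a minimal sketch. It assumes the sample and reference sequences have already been aligned to equal length; the function name `find_differences` and the short toy sequences are hypothetical and are not taken from the sequence listing.

```python
def find_differences(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """List (1-based position, reference base, sample base) for every mismatch.

    Assumes both sequences are already aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), sample.upper())
    return [(pos, ref, smp) for pos, (ref, smp) in enumerate(pairs, start=1) if ref != smp]


# Toy sequences for illustration only (not PSP1 or D87258 sequence data).
toy_reference = "ATGCGTACC"
toy_sample = "ATGTGTACC"
print(find_differences(toy_reference, toy_sample))
```

In practice a difference reported this way would still need to be distinguished from the polymorphic variants described herein before being treated as a mutation.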

Another aspect of the invention is a method of treating conditions which are related to insufficient PSP1 or D87258
protein function which comprises:

(a) isolating cells from a patient deficient in PSP1 or D87258 protein function;

(b) altering the cells by transfecting the polynucleotide of the invention or D87258 into the cells wherein a PSP1
or D87258 protein is expressed; and

(c) introducing the cells back to the patient to alleviate the condition.

Another aspect of the invention is a method of treating conditions which are related to insufficient PSP1 or D87258
protein function which comprises administering the polynucleotide of the invention to a patient deficient in PSP1 protein
function wherein a PSP1 or D87258 protein is expressed and alleviates the condition.

Another aspect of the invention is an antibody immunoreactive with PSP1 or D87258 or an immunogen thereof.

Another aspect of the invention is a transgenic non-human animal capable of expressing in any cell thereof the 
polynucleotide of the invention. 

Another aspect of the invention is a method for determining the genetic predisposition to neurodegeneration in a 






EP 0 828 003 A2 



patient comprising detecting PSP1 or D87258 polymorphisms in a sample from a patient. Yet another aspect of the
invention is an isolated polynucleotide having the nucleotide sequence as set forth in SEQ ID NO: 32, 33, 34, 35, 36, 37,
38, 39, or 40.

Figure 1 is an amino acid sequence alignment of PSP1-1 with E. coli htrA.

Figure 2 is a multiple cDNA sequence alignment of the PSP1 isolates PSP1-1, PSP1-2, PSP1-3 and PSP1-4.

Figure 3 is an amino acid sequence alignment of PSP1-1 with a putative human serine protease.

As used herein, the term "PSP1 polynucleotide" or "PSP1" refers to DNA molecules comprising a nucleotide
sequence that encodes PSP1 and alternative splice variants, i.e., homologs and isoforms, and polymorphic variants.
PSP1 binds to a region encompassing amino acids 269-413 of the human PS-1 protein, contains a conserved serine
protease motif and exhibits homology to the E. coli serine protease htrA described by Lipinska et al. in Nucl. Acids
Res. 16, 10053-10066 (1988) and a putative human serine protease with an IGF-binding motif (Ohno, I., et al., Genbank
Accession No. D87258 (1996)), hereinafter referred to as D87258.

The PSP1-1 sequence is listed in SEQ ID NO: 24. The coding region of this sequence consists of nucleotides
603-1979 of SEQ ID NO: 24. The deduced 458 amino acid sequence of the encoded product PSP1-1 is listed in SEQ
ID NO: 25.

The PSP1-1 sequence listed in SEQ ID NO: 30 includes two polymorphic variants, at nucleotides 672 (C/T) and 
1435 (C/T) resulting in alternative amino acid residues at position 24 (arg/cys) and 278 (ala/val), both in the conserved 
region of nucleotides 1-1540. The deduced 458 amino acid sequence of the encoded product PSP1-1 is listed in SEQ 
ID NO: 31. 
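
The relationship between the polymorphic nucleotide positions and the affected amino acid residues can be checked with simple arithmetic. The following is a minimal sketch, assuming 1-based nucleotide numbering and a coding region that begins at nucleotide 603 as stated for PSP1-1; the function name `codon_position` is hypothetical.

```python
def codon_position(nucleotide: int, cds_start: int) -> int:
    """Return the 1-based codon (residue) number that contains a 1-based nucleotide."""
    if nucleotide < cds_start:
        raise ValueError("nucleotide lies upstream of the coding region")
    return (nucleotide - cds_start) // 3 + 1


CDS_START = 603  # first nucleotide of the PSP1-1 coding region, per the text

for nt in (672, 1435):
    # Nucleotide 672 falls in codon 24 and nucleotide 1435 in codon 278.
    print(f"nucleotide {nt} -> residue {codon_position(nt, CDS_START)}")
```

Both results agree with the residue positions 24 and 278 given for the two C/T variants.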

The PSP1-2 sequence is listed in SEQ ID NO: 23. The coding region of this sequence consists of nucleotides
603-1979 of SEQ ID NO: 23. The deduced 458 amino acid sequence of the encoded product PSP1-2 is listed in SEQ
ID NO: 8. The PSP1-3 sequence is listed in SEQ ID NO: 26. The coding region of this sequence consists of nucleotides
603-1736 of SEQ ID NO: 26. The deduced 377 amino acid sequence of the encoded product PSP1-3 is listed in SEQ
ID NO: 27. The PSP1-4 sequence is listed in SEQ ID NO: 28. The coding region of this sequence consists of nucleotides
603-1913 of SEQ ID NO: 28. The deduced 436 amino acid sequence of the encoded product PSP1-4 is listed in SEQ
ID NO: 29.
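
The coding-region coordinates and deduced protein lengths quoted above can be cross-checked arithmetically. The sketch below assumes, as an interpretation not stated explicitly in the text, that each quoted coding region is numbered inclusively and ends with a stop codon; the names `CODING_REGIONS` and `residues_encoded` are hypothetical.

```python
# (start, end, stated protein length) for each isolate, as quoted in the text.
CODING_REGIONS = {
    "PSP1-1": (603, 1979, 458),
    "PSP1-2": (603, 1979, 458),
    "PSP1-3": (603, 1736, 377),
    "PSP1-4": (603, 1913, 436),
}


def residues_encoded(start: int, end: int, has_stop: bool = True) -> int:
    """Number of amino acids encoded by an inclusive 1-based nucleotide range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    codons = length // 3
    # Assumption: the final codon is a stop codon and encodes no residue.
    return codons - 1 if has_stop else codons


for name, (start, end, stated) in CODING_REGIONS.items():
    computed = residues_encoded(start, end)
    print(f"{name}: computed {computed} residues, stated {stated}")
```

Under that assumption the computed lengths match the stated 458, 458, 377 and 436 residues.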

The D87258 sequence is listed in SEQ ID NO: 17. The coding region of this sequence consists of nucleotides
49-1491 of SEQ ID NO: 17. The deduced 480 amino acid sequence of the encoded product D87258 is listed in SEQ
ID NO: 18. The D87258 sequence listed in SEQ ID NO: 17 includes a polymorphic variant at nucleotide 1325 (G/T)
resulting in alternative amino acid residues at position 213 (gly/val). The sequence in Genbank Accession No. D87258
(1996) describes only 1325G. The novel polynucleotide polymorph of D87258 having 1325T is hereinafter referred
to as D87258 (1325T) and the novel encoded product having valine at 213 is D87258 (1325T) protein. The novel
polynucleotide D87258 (1325T) and its encoded protein can replace PSP1 in any of the compositions, uses or methods
herein described and such novel polypeptide, encoded protein, compositions, uses and methods also form part of the
invention.

As used herein, the term "functional fragments" when used to modify a specific gene or gene product means a
less than full length portion of the gene or gene product which retains substantially all of the biological function
associated with the full length gene or gene product to which it relates. An example of a functional fragment of PSP1 is the
minimal catalytic domain. To determine whether a fragment of a particular gene or gene product is a functional fragment,
fragments are generated by well-known nucleolytic or proteolytic techniques or by the polymerase chain reaction and
the fragments tested for the described biological function.

As used herein, an "antigen" refers to a molecule containing one or more epitopes that will stimulate a host's
immune system to make a humoral and/or cellular antigen-specific response. The term is also used herein
interchangeably with "immunogen."

As used herein, the term "epitope" refers to the site on an antigen or hapten to which a specific antibody molecule
binds. The term is also used herein interchangeably with "antigenic determinant" or "antigenic determinant site."

As used herein, "monoclonal antibody" is understood to include antibodies derived from one species (e.g., murine, 
rabbit, goat, rat, human, etc.) as well as antibodies derived from two (or perhaps more) species (e.g., chimeric and 
humanized antibodies). 

As used herein, a coding sequence is "operably linked to" another coding sequence when RNA polymerase will
transcribe the two coding sequences into a single mRNA, which is then translated into a single polypeptide having
amino acids derived from both coding sequences. The coding sequences need not be contiguous to one another so
long as the expressed sequence is ultimately processed to produce the desired protein.

As used herein, "recombinant" polypeptides refer to polypeptides produced by recombinant DNA techniques; i.e.,
produced from cells transformed by an exogenous DNA construct encoding the desired polypeptide. "Synthetic"
polypeptides are those prepared by chemical synthesis.

As used herein, a "replicon" is any genetic element (e.g., plasmid, chromosome, virus) that functions as an
autonomous unit of DNA replication in vivo; i.e., capable of replication under its own control.






As used herein, a "vector" is a replicon, such as a plasmid, phage, or cosmid, to which another DNA segment may 
be attached so as to bring about the replication of the attached segment. 

As used herein, a "reference" gene refers to the wild type PSP1 sequence of the invention and is understood to
include the various sequence polymorphisms that exist, wherein nucleotide substitutions in the gene sequence exist,
but do not affect the essential function of the gene product.

As used herein, a "mutant" gene refers to PSP1 sequences different from the reference gene wherein nucleotide 
substitutions and/or deletions and/or insertions result in perturbation of the essential function of the gene product. 

As used herein, a DNA "coding sequence of" or a "nucleotide sequence encoding" a particular protein is a DNA
sequence which is transcribed and translated into a polypeptide when placed under the control of appropriate regulatory
sequences.

As used herein, a "promoter sequence" is a DNA regulatory region capable of binding RNA polymerase in a cell
and initiating transcription of a downstream (3' direction) coding sequence. For purposes of defining the present
invention, the promoter sequence is bound at its 3' terminus by a translation start codon (e.g., ATG) of a coding sequence
and extends upstream (5' direction) to include the minimum number of bases or elements necessary to initiate
transcription at levels detectable above background. Within the promoter sequence will be found a transcription initiation
site (conveniently defined by mapping with nuclease S1), as well as protein binding domains (consensus sequences)
responsible for the binding of RNA polymerase. Eukaryotic promoters will often, but not always, contain "TATA" boxes
and "CAT" boxes. Prokaryotic promoters contain Shine-Dalgarno sequences in addition to the -10 and -35 consensus
sequences.

As used herein, DNA "control sequences" refers collectively to promoter sequences, ribosome binding sites,
polyadenylation signals, transcription termination sequences, upstream regulatory domains, enhancers and the like, which
collectively provide for the expression (i.e., the transcription and translation) of a coding sequence in a host cell.

As used herein, a control sequence "directs the expression" of a coding sequence in a cell when RNA polymerase
will bind the promoter sequence and transcribe the coding sequence into mRNA, which is then translated into the
polypeptide encoded by the coding sequence.

As used herein, a "host cell" is a cell which has been transformed or transfected, or is capable of transformation 
or transfection by an exogenous DNA sequence. 

As used herein, a cell has been "transformed" by exogenous DNA when such exogenous DNA has been introduced
inside the cell membrane. Exogenous DNA may or may not be integrated (covalently linked) into chromosomal DNA
making up the genome of the cell. In prokaryotes and yeasts, for example, the exogenous DNA may be maintained
on an episomal element, such as a plasmid. With respect to eukaryotic cells, a stably transformed or transfected cell
is one in which the exogenous DNA has become integrated into the chromosome so that it is inherited by daughter
cells through chromosome replication. This stability is demonstrated by the ability of the eukaryotic cell to establish
cell lines or clones comprised of a population of daughter cells containing the exogenous DNA.

As used herein, "transfection" or "transfected" refers to a process by which cells take up foreign DNA and integrate
that foreign DNA into their chromosome. Transfection can be accomplished, for example, by various techniques in
which cells take up DNA (e.g., calcium phosphate precipitation, electroporation, assimilation of liposomes, etc.) or by
infection, in which viruses are used to transfer DNA into cells.

As used herein, a "target cell" is a cell that is selectively transfected over other cell types (or cell lines).

As used herein, a "clone" is a population of cells derived from a single cell or common ancestor by mitosis. A "cell
line" is a clone of a primary cell that is capable of stable growth in vitro for many generations.

As used herein, a "heterologous" region of a DNA construct is an identifiable segment of DNA within or attached
to another DNA molecule that is not found in association with the other molecule in nature. Thus, when the heterologous
region encodes a gene, the gene will usually be flanked by DNA that does not flank the gene in the genome of the
source animal. Another example of a heterologous coding sequence is a construct where the coding sequence itself
is not found in nature (e.g., synthetic sequences having codons different from the native gene). Allelic variation or
naturally occurring mutational events do not give rise to a heterologous region of DNA, as used herein.

As used herein, a "modulator" of a polypeptide is a substance which can affect the polypeptide function, such as 
an inhibitor of enzymatic activity. 

An aspect of the present invention is isolated polynucleotides encoding a PSP1 protein and substantially similar
sequences. Isolated polynucleotide sequences are substantially similar if they are capable of hybridizing under
moderately stringent conditions to SEQ ID NOs: 23, 24, 26 or 28 or they encode DNA sequences which are degenerate to
SEQ ID NOs: 23, 24, 26 or 28 or are degenerate to those sequences capable of hybridizing under moderately stringent
conditions to SEQ ID NOs: 23, 24, 26 or 28.

Moderately stringent conditions is a term understood by the skilled artisan and has been described in, for example,
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Vol. 1, pp. 101-104, Cold Spring Harbor
Laboratory Press (1989). An exemplary hybridization protocol using moderately stringent conditions is as follows.
Nitrocellulose filters are prehybridized at 65°C in a solution containing 6X SSPE, 5X Denhardt's solution (10g Ficoll, 10g BSA
and 10g polyvinylpyrrolidone per liter solution), 0.05% SDS and 100 µg/ml tRNA. Hybridization probes are labeled,
preferably radiolabeled (e.g., using the Bios TAG-IT® kit). Hybridization is then carried out for approximately 18 hours
at 65°C. The filters are then washed twice in a solution of 2X SSC and 0.5% SDS at room temperature for 15 minutes.
Subsequently, the filters are washed at 58°C, air-dried and exposed to X-ray film overnight at -70°C with an intensifying
screen.

Degenerate DNA sequences encode the same amino acid sequence as SEQ ID NOs: 8, 25, 27 or 29 or the proteins
encoded by that sequence capable of hybridizing under moderately stringent conditions to SEQ ID NOs: 8, 25, 27 or 29,
but have variation(s) in the nucleotide coding sequences because of the degeneracy of the genetic code. For example,
the degenerate codons UUC and UUU both code for the amino acid phenylalanine, whereas the four codons GGX,
where X = U, C, A, or G, all code for glycine.
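
The codon degeneracy described above can be illustrated with a short sketch built on the standard genetic code. The helper names `synonymous_codons` and `translate` and the two six-nucleotide toy sequences are hypothetical and serve only to show that different RNA sequences can encode the same peptide.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_codons(amino_acid: str) -> list[str]:
    """All codons that encode the given one-letter amino acid, sorted alphabetically."""
    return sorted(codon for codon, aa in CODON_TABLE.items() if aa == amino_acid)


def translate(rna: str) -> str:
    """Translate an RNA string codon by codon (length must be a multiple of three)."""
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3))


print(synonymous_codons("F"))  # the two phenylalanine codons
print(synonymous_codons("G"))  # the four glycine codons
# Two different toy sequences that encode the same dipeptide (Phe-Gly).
print(translate("UUUGGA") == translate("UUCGGG"))
```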

Alternatively, substantially similar sequences are defined as those nucleotide sequences encoding proteins having
PSP1 activity in which about 70%, preferably about 80%, and most preferably about 90%, of the nucleotides share
identity with PSP1, i.e., a sequence encoding a protein having PSP1 activity is substantially similar to any of SEQ ID
NOs: 23, 24, 26 or 28 when at least about 70% of all of the nucleotides of the sequence match SEQ ID NOs: 23, 24,
26 or 28. Nucleotide sequences that are substantially similar can be identified by hybridization or by sequence
comparison.
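
Identification by sequence comparison can be sketched as a percent-identity calculation. This is a minimal illustration that assumes the two sequences have already been aligned to equal length without gaps; it is not the alignment procedure itself. The function names and the ten-nucleotide toy sequences are hypothetical, and the default 70% threshold simply mirrors the "at least about 70%" figure in the text.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned, equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def is_substantially_similar(seq_a: str, seq_b: str, threshold: float = 70.0) -> bool:
    """True when percent identity meets the chosen threshold (70% by default)."""
    return percent_identity(seq_a, seq_b) >= threshold


# Toy sequences for illustration only; they differ at 2 of 10 positions.
toy_a = "ATGGCCATTG"
toy_b = "ATGGCTATTA"
print(percent_identity(toy_a, toy_b))
print(is_substantially_similar(toy_a, toy_b))
print(is_substantially_similar(toy_a, toy_b, threshold=90.0))
```

Raising the threshold to 80 or 90 corresponds to the preferred and most preferred levels of identity. Identity alone does not establish PSP1 activity, which the definition also requires.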

Embodiments of the isolated polynucleotides of the invention include DNA, genomic DNA and RNA, preferably of
human origin. A method for isolating a nucleic acid molecule encoding a PSP1 protein is to probe a genomic or cDNA
library with a natural or artificially designed probe using art recognized procedures. See, e.g., "Current Protocols in
Molecular Biology", Ausubel et al. (eds.), Greene Publishing Association and John Wiley Interscience, New York,
1989, 1992. The ordinarily skilled artisan will appreciate that SEQ ID NOs: 23, 24, 26 or 28 or fragments thereof
comprising at least 15 contiguous nucleotides are particularly useful probes. It is also appreciated that such probes can
be and are preferably labeled with an analytically detectable reagent to facilitate identification of the probe. Useful
reagents include, but are not limited to, radioisotopes, fluorescent dyes or enzymes capable of catalyzing the formation
of a detectable product. The probes would enable the ordinarily skilled artisan to isolate complementary copies of
genomic DNA, cDNA or RNA polynucleotides encoding PSP1 proteins from human, mammalian or other animal
sources or to screen such sources for related sequences, e.g., additional members of the family, type and/or subtype,
including transcriptional regulatory and control elements as well as other stability, processing, translation and tissue
specificity-determining regions from 5' and/or 3' regions relative to the coding sequences disclosed herein, all without
undue experimentation.

Another aspect of the invention is functional polypeptides encoded by the polynucleotides of the invention and 
substantially similar polypeptides. An embodiment of a functional polypeptide of the invention is the PSP1 protein 
having the amino acid sequence set forth in SEQ ID NO: 8, 25, 27 or 29. 

Polypeptide sequences that are substantially similar are those sequences having PSP1 activity in which about 50%,
preferably 70%, and most preferably about 90%, of the amino acids share identity with PSP1, i.e., a sequence
representing a polypeptide having PSP1 activity is substantially similar to any of SEQ ID NOs: 8, 25, 27 or 29 when at least
about 50% of all of the amino acids of the sequence match SEQ ID NOs: 8, 25, 27 or 29. Substantially similar polypeptide
sequences can be identified by techniques such as proteolytic digestion, gel electrophoresis, microsequencing and/or
sequence comparison, e.g., through use of the GAP algorithm available from the University of Wisconsin Genetics
Computer Group.

Another aspect of the invention is a method for preparing essentially pure PSP1 protein. Yet another aspect is the
PSP1 protein produced by the preparation method of the invention. This protein has the amino acid sequence listed
in SEQ ID NOs: 8, 25, 27 or 29 and includes variants with a substantially similar amino acid sequence that have the
same function. The proteins of this invention are preferably made by recombinant genetic engineering techniques by
culturing a recombinant host cell containing a vector encoding the polynucleotides of the invention under conditions
promoting the expression of the protein and recovery thereof.

The isolated polynucleotides, particularly the DNAs, can be introduced into expression vectors by operatively
linking the DNA to the necessary expression control regions, e.g., regulatory regions, required for gene expression. The
vectors can be introduced into an appropriate host cell such as a prokaryotic, e.g., bacterial, or eukaryotic, e.g., yeast
or mammalian cell by methods well known in the art. See Ausubel et al., supra. The coding sequences for the desired
proteins, having been prepared or isolated, can be cloned into any suitable vector or replicon. Numerous cloning vectors
are known to those of skill in the art and the selection of an appropriate cloning vector is a matter of choice. Examples
of recombinant DNA vectors for cloning and host cells which they can transform include, but are not limited to, the
bacteriophage λ (E. coli), pBR322 (E. coli), pACYC177 (E. coli), pKT230 (gram-negative bacteria), pGV1106
(gram-negative bacteria), pLAFR1 (gram-negative bacteria), pME290 (non-E. coli gram-negative bacteria), pHV14 (E. coli
and Bacillus subtilis), pBD9 (Bacillus), pIJ61 (Streptomyces), pUC6 (Streptomyces), YIp5 (Saccharomyces), a
baculovirus insect cell system, a Drosophila insect system, YCp19 (Saccharomyces) and pSV2neo (mammalian cells). See
generally, "DNA Cloning": Vols. I & II, Glover et al. ed., IRL Press Oxford (1985) (1987); and T. Maniatis et al. ("Molecular
Cloning", Cold Spring Harbor Laboratory (1982)).

The gene can be placed under the control of control elements such as a promoter, ribosome binding site (for
bacterial expression) and, optionally, an operator, so that the DNA sequence encoding the desired protein is transcribed
into RNA in the host cell transformed by a vector containing the expression construct. The coding sequence may or
may not contain a signal peptide or leader sequence. The proteins of the present invention can be expressed using,
for example, the E. coli tac promoter or the protein A gene (spa) promoter and signal sequence. Leader sequences
can be removed by the bacterial host in post-translational processing. See, e.g., U.S. Patent Nos. 4,431,739; 4,425,437
and 4,338,397.

In addition to control sequences, it may be desirable to add regulatory sequences which allow for regulation of the
expression of the protein sequences relative to the growth of the host cell. Regulatory sequences are known to those
of skill in the art. Exemplary are those which cause the expression of a gene to be turned on or off in response to a
chemical or physical stimulus, including the presence of a regulatory compound or to various temperature or metabolic
conditions. Other types of regulatory elements may also be present in the vector, for example, enhancer sequences.

An expression vector is constructed so that the particular coding sequence is located in the vector with the
appropriate regulatory sequences, the positioning and orientation of the coding sequence with respect to the control
sequences being such that the coding sequence is transcribed under the "control" of the control sequences, i.e., RNA
polymerase which binds to the DNA molecule at the control sequences transcribes the coding sequence. Modification
of the sequences encoding the particular antigen of interest may be desirable to achieve this end. For example, in
some cases it may be necessary to modify the sequence so that it may be attached to the control sequences with the
appropriate orientation; i.e., to maintain the reading frame. The control sequences and other regulatory sequences
may be ligated to the coding sequence prior to insertion into a vector, such as the cloning vectors described above.
Alternatively, the coding sequence can be cloned directly into an expression vector which already contains the control
sequences and an appropriate restriction site.

In some cases, it may be desirable to produce mutants or analogues of PSP1 protein. Mutants or analogues may
be prepared by the deletion of a portion of the sequence encoding the protein, by insertion of a sequence, and/or by
substitution of one or more nucleotides within the sequence. Techniques for modifying nucleotide sequences, such as
site-directed mutagenesis, are well known to those skilled in the art. See, e.g., T. Maniatis et al., supra; "DNA Cloning,"
Vols. I and II, supra; and "Nucleic Acid Hybridization", supra.

Depending on the expression system and host selected, the proteins of the present invention are produced by
growing host cells transformed by an expression vector described above under conditions whereby the protein of
interest is expressed. Preferred mammalian cells include human embryonic kidney cells (293), monkey kidney cells,
fibroblast (COS) cells, Chinese hamster ovary (CHO) cells, Drosophila or murine L-cells. If the expression system
secretes the protein into growth media, the protein can be purified directly from the media. If the protein is not secreted,
it is isolated from cell lysates or recovered from the cell membrane fraction. The selection of the appropriate growth
conditions and recovery methods is within the skill of the art.

An alternative method to identify proteins of the present invention is by constructing gene libraries, using the
resulting clones to transform E. coli and pooling and screening individual colonies using polyclonal serum or monoclonal
antibodies to PSP1.

The proteins of the present invention may also be produced by chemical synthesis such as solid phase peptide
synthesis on an automated peptide synthesizer, using known amino acid sequences or amino acid sequences derived
from the DNA sequence of the genes of interest. Such methods are known to those skilled in the art.

The proteins of the present invention or their immunogenic fragments comprising at least one epitope can be used
to produce antibodies, both polyclonal and monoclonal, directed to epitopes corresponding to amino acid sequences
disclosed herein. If polyclonal antibodies are desired, a selected mammal such as a mouse, rabbit, goat or horse is
immunized with a protein of the present invention, or its fragment, or a mutant protein. Serum from the immunized
animal is collected and treated according to known procedures. Serum polyclonal antibodies can be purified by
immunoaffinity chromatography or other known procedures.

Monoclonal antibodies to the proteins of the present invention, and to the immunogenic fragments thereof, can
also be readily produced by one skilled in the art. The general methodology for making monoclonal antibodies by using
hybridoma technology is well known. Immortal antibody-producing cell lines can be created by cell fusion and also by
other techniques such as direct transformation of B lymphocytes with oncogenic DNA or transfection with Epstein-Barr
virus. See, e.g., M. Schreier et al., "Hybridoma Techniques" (1980); Hammerling et al., "Monoclonal Antibodies and
T-cell Hybridomas" (1981); Kennett et al., "Monoclonal Antibodies" (1980); and U.S. Patent Nos. 4,341,761; 4,399,121;
4,427,783; 4,444,887; 4,452,570; 4,466,917; 4,472,500; 4,491,632; and 4,493,890. Panels of monoclonal antibodies
produced against the antigen of interest, or fragment thereof, can be screened for various properties, e.g., for isotype,
epitope or affinity. Monoclonal antibodies are useful in purification, using immunoaffinity techniques, of the individual
antigens which they are directed against. Alternatively, genes encoding the monoclonals of interest may be isolated
from the hybridomas by PCR techniques known in the art and cloned and expressed in the appropriate vectors. The
antibodies of this invention, whether polyclonal or monoclonal, have additional utility in that they may be employed as
reagents in immunoassays, RIA, ELISA, and the like. The antibodies of the invention can be labeled with an analytically
detectable reagent such as a radioisotope, fluorescent molecule or enzyme.

Chimeric antibodies, in which non-human variable regions are joined or fused to human constant regions (see,
e.g., Liu et al., Proc. Natl. Acad. Sci. USA, 84, 3439 (1987)), may also be used in assays or therapeutically. Preferably,
a therapeutic monoclonal antibody would be "humanized" as described in Jones et al., Nature, 321, 522 (1986);
Verhoeyen et al., Science, 239, 1534 (1988); Kabat et al., J. Immunol., 147, 1709 (1991); Queen et al., Proc. Natl. Acad.
Sci. USA, 86, 10029 (1989); Gorman et al., Proc. Natl. Acad. Sci. USA, 88, 34181 (1991); and Hodgson et al.,
Bio/Technology, 9, 421 (1991).

Another aspect of the present invention is modulators of the polypeptides of the invention or of D87258. Functional
modulation of PSP1 or D87258 by a substance includes partial to complete inhibition of function, such as inhibition of
proteolytic activity, identical function, as well as enhancement of function. Embodiments of modulators of the invention
include peptides, oligonucleotides and small organic molecules including peptidomimetics. Modulators of the invention
may be useful as therapeutics or prophylactics for all forms of neurodegeneration including AD. Modulators of PSP1
or D87258 proteolytic activity relative to other endogenous substrates may also be useful for the treatment of other
types of human disease states.

Another aspect of the invention is antisense oligonucleotides comprising a sequence which is capable of binding
to the polynucleotides of the invention. Synthetic oligonucleotides or related antisense chemical structural analogs can
be designed by those of ordinary skill in the art to recognize, specifically bind to and prevent transcription of a target
nucleic acid encoding PSP1 or D87258 protein. See generally, Cohen, J.S., Trends in Pharm. Sci., 10, 435 (1989) and
Weintraub, H.M., Scientific American, January (1990) at page 40.

Another aspect of the invention is a method for assaying a medium for the presence of a substance that modulates
PSP1 or D87258 protein function by affecting the binding of PSP1 or D87258 protein to cellular binding partners.
Examples of modulators include, but are not limited to, peptides and small organic molecules including peptidomimetics.
A PSP1 or D87258 protein is provided having the amino acid sequence of PSP1 (SEQ ID NOs: 8, 25, 27 or 29) or
D87258 (SEQ ID NO: 18) or a functional fragment thereof together with a cellular binding partner or synthetic analog
thereof. The mixture is incubated with a test substance which is suspected of modulating PSP1 or D87258 activity,
under conditions which permit the formation of a PSP1 or D87258 gene product/cellular binding partner complex. An
assay is performed for the presence of the complex, free PSP1 or D87258 protein or free cellular binding partner and
the result compared to a control to determine the effect of the test substance.

Another aspect of the invention is a method for assaying a medium for the presence of a substance that modulates
PSP1 or D87258 protein function by inhibiting its proteolytic activity on cellular substrates. Examples of modulators
include, but are not limited to, peptides and small organic molecules including peptidomimetics. Cellular substrates can
include PS-1, PS-2, APP or other substrates. A PSP1 or D87258 protein is provided having the amino acid sequence
of PSP1 (SEQ ID NOs: 8, 25, 27 or 29) or D87258 (SEQ ID NO: 18) or a functional fragment thereof together with a
cellular substrate or synthetic analog thereof. The mixture is incubated with a test substance which is suspected of
inhibiting PSP1 or D87258 activity, under conditions which permit the formation of a PSP1 or D87258 enzyme/substrate
complex and subsequent cleavage of the substrate.

Another aspect of the invention is a method for assaying for the presence of a substance that modulates PSP1 or
D87258 activity by direct binding to PSP1 or D87258 protein. Examples of modulators include, but are not limited to,
peptides and small organic molecules including peptidomimetics. Modulator candidates are synthesized on a solid
support by techniques such as those disclosed in Lam et al., Nature 354, 82 (1991) or Burbaum et al., Proc. Natl. Acad.
Sci. USA 92, 6027 (1995) to provide solid support-associated modulator candidates. A labelled PSP1 or D87258 protein
is provided having the amino acid sequence of PSP1 (SEQ ID NOs: 8, 25, 27 or 29) or D87258 (SEQ ID NO: 18) or a
functional derivative thereof. Exemplary labels include directly attached fluorescent or colored dyes, biotin,
radioisotopes or epitope tags, which are detectable by a suitable antibody. A mixture of solid support-associated modulator
candidates and labelled PSP1 or D87258 protein is incubated under conditions which can permit the formation of a
PSP1 or D87258 protein/modulator candidate complex. The solid support is separated from free soluble labelled PSP1
or D87258 protein. An assay is performed for the presence of solid support-associated labelled protein. Solid supports
complexed with labelled protein are isolated and the identity of the modulator candidate determined by techniques well
known to those skilled in the art, such as the TOF-SIMS method in Brummel et al., Science 264, 399-402 (1994).

Modulation of PSP1 or D87258 function would be expected to have effects on presenilin cleavage, the cleavage
of other proteins or βA4 production. Any modulators so identified would be expected to be useful as a therapeutic for
the treatment and prevention of neurodegeneration including FAD and AD.

Further, PSP1 or D87258 could be used to isolate proteins which interact with it and this interaction could be a
target for interference. Inhibitors of protein-protein interactions between PSP1 or D87258 and other factors could lead
to the development of pharmaceutical agents for the modulation of PSP1 or D87258 activity.

Methods to assay for protein-protein interactions, such as that of a PSP1 or D87258 gene product/binding partner
complex, and to isolate proteins interacting with PSP1 or D87258 are known to those skilled in the art. Use of the
methods discussed below enables one of ordinary skill in the art to accomplish these aims without undue
experimentation.

The yeast two-hybrid system provides methods for detecting the interaction between a first test protein and a
second test protein, in vivo, using reconstitution of the activity of a transcriptional activator. The method is disclosed
in U.S. Patent No. 5,283,173; reagents are available from Clontech and Stratagene. Briefly, PSP1 cDNA is fused to a
Gal4 or LexA transcription factor DNA binding domain and expressed in yeast cells. cDNA library members obtained
from cells of interest are fused to a transactivation domain of Gal4 or another transactivation domain. cDNA clones
which express proteins which can interact with PSP1 will lead to reconstitution of transcription factor activity, such as
that of Gal4, and transactivation of expression of a reporter gene such as GAL1-lacZ.

An alternative method is screening of λgt11, λZAP (Stratagene) or equivalent cDNA expression libraries with
recombinant PSP1. Recombinant PSP1 protein or fragments thereof are fused to small peptide tags such as FLAG, HSV
or GST. The peptide tags can possess convenient phosphorylation sites for a kinase such as heart muscle creatine
kinase or they can be biotinylated. Recombinant PSP1 can be phosphorylated with 32P or used unlabeled and
detected with streptavidin or antibodies against the tags. λgt11 cDNA expression libraries are made from cells of interest
and are incubated with the recombinant PSP1, washed, and cDNA clones which interact with PSP1 are isolated. See,
e.g., T. Maniatis et al., supra.

Another method is the screening of a mammalian expression library in which the cDNAs are cloned into a vector
between a mammalian promoter and polyadenylation site and transiently transfected in COS or 293 cells followed by
detection of the binding protein 48 hours later by incubation of fixed and washed cells with a labelled PSP1, preferably
iodinated, and detection of bound PSP1 by autoradiography (See Sims et al., Science 241, 585-589 (1988) and
McMahan et al., EMBO J. 10, 2821-2832 (1991)). In this manner, pools of cDNAs containing the cDNA encoding the
binding protein of interest can be selected and the cDNA of interest can be isolated by further subdivision of each pool
followed by cycles of transient transfection, binding and autoradiography. Alternatively, the cDNA of interest can be
isolated by transfecting the entire cDNA library into mammalian cells and panning the cells on a dish containing PSP1
bound to the plate. Cells which attach after washing are lysed and the plasmid DNA isolated, amplified in bacteria, and
the cycle of transfection and panning repeated until a single cDNA clone is obtained (See Seed et al., Proc. Natl. Acad.
Sci. USA 84, 3365 (1987) and Aruffo et al., EMBO J. 6, 3313 (1987)). If the binding protein is secreted, its cDNA can
be obtained by a similar pooling strategy once a binding or neutralizing assay has been established for assaying
supernatants from transiently transfected cells. General methods for screening supernatants are disclosed in Wong et
al., Science 228, 810-815 (1985).

Another alternative method is isolation of proteins interacting with PSP1 directly from cells. Fusion proteins of
PSP1 with GST or small peptide tags are made and immobilized on beads. Biosynthetically labeled or unlabeled protein
extracts from the cells of interest are prepared, incubated with the beads and washed with buffer. Proteins interacting
with PSP1 are eluted specifically from the beads and analyzed by SDS-PAGE. Binding partner primary amino acid
sequence data are obtained by microsequencing. Optionally, the cells can be treated with agents that induce a functional
response such as tyrosine phosphorylation of cellular proteins. An example of such an agent would be a growth factor
or cytokine such as erythropoietin or interleukin-3.

Another alternative method is immunoaffinity purification. Recombinant PSP1 is incubated with labeled or
unlabeled cell extracts and immunoprecipitated with anti-PSP1 antibodies. The immunoprecipitate is recovered with protein
A-Sepharose and analyzed by SDS-PAGE. Unlabeled proteins are labeled by biotinylation and detected on SDS gels
with streptavidin. Binding partner proteins are analyzed by microsequencing. Further, standard biochemical purification
steps known to those skilled in the art may be used prior to microsequencing.

Yet another alternative method is screening of peptide libraries for binding partners. Recombinant tagged or labeled
PSP1 is used to select peptides from a peptide or phosphopeptide library which interact with PSP1. Sequencing of the
peptides leads to identification of consensus peptide sequences which might be found in interacting proteins.

PSP1 or D87258 binding partners identified by any of these methods or other methods which would be known to
those of ordinary skill in the art, as well as those putative binding partners discussed above, can be used in the assay
method of the invention. Assaying for the presence of PSP1 or D87258/binding partner complex is accomplished
by, for example, the yeast two-hybrid system, ELISA or immunoassays using antibodies specific for the complex. In
the presence of test substances which interrupt or inhibit formation of PSP1 or D87258/binding partner interaction, a
decreased amount of complex will be determined relative to a control lacking the test substance.
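
The comparison against a control lacking the test substance can be expressed as a percent decrease in complex signal. The following is only a minimal sketch: the function name, the background-subtraction step and the absorbance readings are hypothetical illustrations and are not taken from the specification.

```python
def percent_inhibition(signal_with_test: float,
                       signal_control: float,
                       background: float = 0.0) -> float:
    """Percent decrease in complex signal relative to a control lacking
    the test substance. All inputs are raw readings in the same units."""
    control = signal_control - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    treated = signal_with_test - background
    return 100.0 * (1.0 - treated / control)


# Hypothetical ELISA absorbance readings (arbitrary units), for illustration only.
value = percent_inhibition(signal_with_test=0.30, signal_control=1.10, background=0.10)
print(f"{value:.1f}% less complex than control")
```

A positive value corresponds to the decreased amount of complex described above; a value near zero indicates that the test substance had no measurable effect in this readout.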

Assays for free PSP1 or D87258, or binding partner, are accomplished by, for example, ELISA or immunoassay
using specific antibodies or by incubation of radiolabeled PSP1 or D87258 with cells or cell membranes followed by
centrifugation or filter separation steps. In the presence of test substances which interrupt or inhibit formation of PSP1
or D87258/binding partner interaction, an increased amount of free PSP1 or D87258, or free binding partner, will be
determined relative to a control lacking the test substance.

Another aspect of the invention is pharmaceutical compositions comprising an effective amount of a PSP1 or
D87258 modulator of the invention and a pharmaceutically acceptable carrier. Pharmaceutical compositions of
modulators of this invention for parenteral administration, i.e., subcutaneously, intramuscularly or intravenously, or oral
administration can be prepared.

The compositions for parenteral administration will commonly comprise a solution of the modulators of the invention
or a cocktail thereof dissolved in an acceptable carrier, preferably an aqueous carrier. A variety of aqueous carriers
may be employed, e.g., water, buffered water, 0.4% saline, 0.3% glycine and the like. These solutions are sterile and
generally free of particulate matter. These solutions may be sterilized by conventional, well-known sterilization
techniques. The compositions may contain pharmaceutically acceptable auxiliary substances as required to approximate
physiological conditions such as pH adjusting and buffering agents, etc. The concentration of the modulator of the
invention in such pharmaceutical formulation can vary widely, i.e., from less than about 0.5%, usually at or at least
about 1% to as much as 15 or 20% by weight and will be selected primarily based on fluid volumes, viscosities, etc.
according to the particular mode of administration selected.

Thus, a pharmaceutical composition of the modulator of the invention for intramuscular injection could be prepared
to contain 1 mL sterile buffered water, and 50 mg of a protein of the invention. Similarly, a pharmaceutical composition
of the modulator of the invention for intravenous infusion could be made up to contain 250 mL of sterile Ringer's solution,
and 150 mg of a modulator of the invention. Actual methods for preparing parenterally administrable compositions are
well known or will be apparent to those skilled in the art and are described in more detail in, for example, Remington's
Pharmaceutical Science, 15th ed., Mack Publishing Company, Easton, Pennsylvania.
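
For orientation, the two example formulations can be converted to approximate concentrations. This is a sketch under a stated assumption: weight/volume percent is used as a stand-in for the "by weight" figures in the text, which is reasonable only for dilute aqueous solutions with a density near 1 g/mL. The function name is illustrative.

```python
def percent_w_v(mass_mg: float, volume_ml: float) -> float:
    """Weight/volume percent: grams of solute per 100 mL of solution."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    return (mass_mg / 1000.0) / volume_ml * 100.0


# Intramuscular example from the text: 50 mg in 1 mL (roughly 5% w/v).
print(percent_w_v(50, 1))
# Intravenous example from the text: 150 mg in 250 mL (roughly 0.06% w/v).
print(percent_w_v(150, 250))
```

Under this assumption the intramuscular example falls inside the stated 1% to 20% range, while the infusion example is at the dilute end ("less than about 0.5%").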

The physician will determine the dosage of the present therapeutic agents which will be most suitable and it will
vary with the form of administration and the particular compound chosen, and furthermore, it will vary with the particular
patient under treatment. Generally, the physician will wish to initiate treatment with small dosages substantially less
than the optimum dose of the compound and increase the dosage by small increments until the optimum effect under
the circumstances is reached. It will generally be found that when the composition is administered orally, larger
quantities of the active agent will be required to produce the same effect as a smaller quantity given parenterally. The
therapeutic dosage will generally be from 0.1 to 1000 milligrams per day and higher although it may be administered
in several different dosage units.

Depending on the patient condition, the pharmaceutical composition of the invention can be administered for
prophylactic and/or therapeutic treatments. In therapeutic applications, compositions containing the present compounds
or a cocktail thereof are administered to a patient already suffering from a disease in an amount sufficient to cure or
at least partially arrest the disease and its complications. In prophylactic applications, compositions containing the
present compounds or a cocktail thereof are administered to a patient not already in a disease state to enhance the
patient's resistance to the disease.

Single or multiple administrations of the pharmaceutical compositions can be carried out with dose levels and
pattern being selected by the treating physician. In any event, the pharmaceutical composition of the invention should
provide a quantity of the modulators of the invention sufficient to effectively treat the patient.

Additionally, some diseases result from inherited defective genes. These genes can be detected by comparing the
sequence of the defective gene with that of a normal one. Individuals carrying mutations in the PSP1 or D87258 gene
may be detected at the DNA level by a variety of techniques. Nucleic acids used for diagnosis (genomic DNA, mRNA,
etc.) may be obtained from a patient's cells, such as from blood, urine, saliva or tissue biopsy, e.g., chorionic villi
sampling or removal of amniotic fluid cells and autopsy material. The genomic DNA may be used directly for detection
or may be amplified enzymatically by using PCR, ligase chain reaction (LCR), strand displacement amplification (SDA),
etc. prior to analysis. See, e.g., Saiki et al., Nature, 324, 163-166 (1986); Bej et al., Crit. Rev. Biochem. Molec. Biol.,
26, 301-334 (1991); Birkenmeyer et al., J. Virol. Meth., 35, 117-126 (1991); van Brunt, J., Bio/Technology, 8, 291-294
(1990). RNA or cDNA may also be used for the same purpose. As an example, PCR primers complementary to the
nucleic acid of the instant invention can be used to identify and analyze PSP1 or D87258 mutations. For example,
deletions and insertions can be detected by a change in size of the amplified product in comparison to the normal
PSP1 or D87258 genotype. Point mutations can be identified by hybridizing amplified DNA to radiolabeled PSP1 or
D87258 RNA of the invention or, alternatively, radiolabeled PSP1 or D87258 antisense DNA sequences of the invention.
Perfectly matched sequences can be distinguished from mismatched duplexes by RNase A digestion or by differences
in melting temperatures (Tm). Such a diagnostic would be particularly useful for prenatal and even neonatal testing.
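
The size comparison used to detect deletions and insertions can be sketched as follows. The function, the tolerance parameter (standing in for gel sizing resolution) and the fragment sizes in the usage lines are hypothetical and serve only to illustrate the comparison against the normal genotype.

```python
def classify_amplicon(observed_bp: int, reference_bp: int, tolerance_bp: int = 0) -> str:
    """Compare an amplified product with the product expected from the
    normal genotype and report an apparent insertion or deletion."""
    diff = observed_bp - reference_bp
    if abs(diff) <= tolerance_bp:
        return "no size change"
    if diff > 0:
        return f"insertion of about {diff} bp"
    return f"deletion of about {-diff} bp"


# Hypothetical product sizes, for illustration only.
print(classify_amplicon(observed_bp=412, reference_bp=400))
print(classify_amplicon(observed_bp=385, reference_bp=400))
print(classify_amplicon(observed_bp=401, reference_bp=400, tolerance_bp=2))
```

Point mutations do not change product size and therefore require the hybridization, RNase A or melting-temperature approaches described above.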

In addition, point mutations and other sequence differences between the reference gene and "mutant" genes can
be identified by yet other well-known techniques, e.g., direct DNA sequencing or single-strand conformational
polymorphism analysis. See Orita et al., Genomics, 5, 874-879 (1989). For example, a sequencing primer is used with
double-stranded PCR product or a single-stranded template molecule generated by a modified PCR. The sequence
determination is performed by conventional procedures with radiolabeled nucleotides or by automatic sequencing
procedures with fluorescent tags. Cloned DNA segments may also be used as probes to detect specific DNA segments.
The sensitivity of this method is greatly enhanced when combined with PCR. Further, point mutations and other sequence
variations, such as polymorphisms, can be detected as described above, e.g., through the use of allele-specific oligonucleotides
for PCR amplification of sequences that differ by single nucleotides. Oligonucleotides having sequences as set forth
in SEQ ID NOs: 32, 33, 34, 35, 36, 37, 38, 39 and 40 are useful in such a method. These methods are useful for
determining the genetic predisposition to neurodegeneration in a patient by detecting polymorphisms within PSP1 or
D87258 in a sample from a patient. Preferably, the polymorphisms detected are at nucleotide 672 of PSP1, at nucleotide
1435 of PSP1 or at nucleotide 1325 of D87258. Preferably, the polymorphisms are detected by PCR; most preferably,
the polymorphisms are detected by PCR with oligonucleotides having a nucleotide sequence selected from the group
consisting of SEQ ID NOs: 32, 33, 34, 35, 36, 37, 38, 39 and 40. Preferably, the neurodegeneration predisposition
determined is to Alzheimer's disease.

Genetic testing based on DNA sequence differences may be achieved by detection of alteration in electrophoretic
mobility of DNA fragments in gels with or without denaturing agents. Small sequence deletions and insertions can be
visualized by high resolution gel electrophoresis. DNA fragments of different sequences may be distinguished on
denaturing formamide gradient gels in which the mobilities of different DNA fragments are retarded in the gel at different
positions according to their specific melting or partial melting temperatures. See, e.g., Myers et al., Science, 230, 1242
(1985). In addition, sequence alterations, in particular small deletions, may be detected as changes in the migration
pattern of DNA heteroduplexes in non-denaturing gel electrophoresis such as heteroduplex electrophoresis. See,
e.g., Nagamine et al., Am. J. Hum. Genet., 45, 337-339 (1989). Sequence changes at specific locations may also be
revealed by nuclease protection assays, such as RNase and S1 protection or the chemical cleavage method as
disclosed by Cotton et al. in Proc. Natl. Acad. Sci. USA, 85, 4397-4401 (1985).

Thus, the detection of a specific DNA sequence may be achieved by methods such as hybridization (e.g.,
heteroduplex electrophoresis, see White et al., Genomics, 12, 301-306 (1992)), RNase protection (e.g., Myers et al., Science,
230, 1242 (1985)), chemical cleavage (e.g., Cotton et al., Proc. Natl. Acad. Sci. USA, 85, 4397-4401 (1985)), direct
DNA sequencing, or the use of restriction enzymes (e.g., restriction fragment length polymorphisms (RFLP) in which
variations in the number and size of restriction fragments can indicate insertions, deletions, presence of nucleotide
repeats and any other mutation which creates or destroys an endonuclease restriction sequence). Southern blotting of
genomic DNA may also be used to identify large (i.e., greater than 100 base pair) deletions and insertions.
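
The RFLP principle, in which a mutation that destroys a restriction sequence changes the number and size of fragments, can be illustrated with a short sketch. The two sequences below are invented for illustration and are not PSP1 or D87258 sequences; EcoRI (recognition sequence GAATTC, cutting after the first G) is used only as a familiar example enzyme, and the digest is modeled on one strand of a linear molecule.

```python
from typing import List


def fragment_lengths(seq: str, site: str, cut_offset: int) -> List[int]:
    """Lengths of fragments produced by cutting a linear sequence at every
    occurrence of `site`; `cut_offset` is the cut position within the site."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


ECORI_SITE = "GAATTC"
ECORI_CUT = 1  # EcoRI cuts between G and AATTC

# Invented sequences: the variant carries one A->G change inside the site.
reference = "ATGCCGTA" + "GAATTC" + "GGCTAACGT"
variant = "ATGCCGTA" + "GAGTTC" + "GGCTAACGT"

print(fragment_lengths(reference, ECORI_SITE, ECORI_CUT))  # [9, 14]
print(fragment_lengths(variant, ECORI_SITE, ECORI_CUT))    # [23]
```

The reference yields two fragments while the variant remains uncut, which is the kind of fragment-pattern difference an RFLP analysis reads out on a gel.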

In addition to conventional gel electrophoresis and DNA sequencing, mutations such as microdeletions,
aneuploidies, translocations and inversions can also be detected by in situ analysis. See, e.g., Keller et al., DNA Probes, 2nd Ed.,
Stockton Press, New York, N.Y., USA (1993). That is, DNA or RNA sequences in cells can be analyzed for mutations
without isolation and/or immobilization onto a membrane. Fluorescence in situ hybridization (FISH) is presently the
most commonly applied method and numerous reviews of FISH have appeared. See, e.g., Trachuck et al., Science,
250, 559-562 (1990), and Trask et al., Trends Genet., 7, 149-154 (1991). Hence, by using nucleic acids based on the
structure of the PSP1 or D87258 genes, one can develop diagnostic tests for genetic mutations.

In addition, some diseases are a result of, or are characterized by, changes in gene expression which can be
detected by changes in the mRNA. Alternatively, the PSP1 or D87258 gene can be used as a reference to identify
individuals expressing an increased or decreased level of PSP1 or D87258 mRNA, e.g., by Northern blotting or in situ
hybridization.

Defining appropriate hybridization conditions is within the skill of the art. See, e.g., "Current Protocols in Mol. Biol.",
Vol. I & II, Wiley Interscience, Ausubel et al. (eds.) (1992). Probing technology is well known in the art and it is
appreciated that the size of the probes can vary widely but it is preferred that the probe be at least 15 nucleotides in length.
It is also appreciated that such probes can be and are preferably labeled with an analytically detectable reagent to
facilitate identification of the probe. Useful reagents include but are not limited to radioisotopes, fluorescent dyes or
enzymes capable of catalyzing the formation of a detectable product. As a general rule, the more stringent the
hybridization conditions, the more closely related the recovered genes will be.

The putative role of PSP1 or D87258 in presenilin biochemistry establishes yet another aspect of the invention
which is gene therapy. "Gene therapy" means gene supplementation where an additional reference copy of a gene of
interest is inserted into a patient's cells. As a result, the protein encoded by the reference gene corrects the defect and
permits the cells to function normally, thus alleviating disease symptoms. The reference copy would be a wild-type
form of the PSP1 or D87258 gene or a gene encoding a protein or peptide which modulates the activity of the
endogenous PSP1 or D87258.

Gene therapy of the present invention can occur in vivo or ex vivo. Ex vivo gene therapy requires the isolation and
purification of patient cells, the introduction of a therapeutic gene and introduction of the genetically altered cells back
into the patient. A replication-deficient virus such as a modified retrovirus can be used to introduce the therapeutic
PSP1 or D87258 gene into such cells. For example, mouse Moloney leukemia virus (MMLV) is a well-known vector in
clinical gene therapy trials. See, e.g., Boris-Lawrie et al., Curr. Opin. Genet. Dev., 3, 102-109 (1993).

In contrast, in vivo gene therapy does not require isolation and purification of a patient's cells. The therapeutic
gene is typically "packaged" for administration to a patient such as in liposomes or in a replication-deficient virus such
as adenovirus as described by Berkner, K.L., in Curr. Top. Microbiol. Immunol., 158, 39-66 (1992) or adeno-associated
virus (AAV) vectors as described by Muzyczka, N., in Curr. Top. Microbiol. Immunol., 158, 97-129 (1992) and U.S.
Patent No. 5,252,479. Another approach is administration of "naked DNA" in which the therapeutic gene is directly
injected into the bloodstream or muscle tissue. Yet another approach is administration of "naked DNA" in which the
therapeutic gene is introduced into the target tissue by microparticle bombardment using gold particles coated with the DNA.

Cell types useful for gene therapy of the present invention include lymphocytes, hepatocytes, myoblasts,
fibroblasts, any cell of the eye such as retinal cells, epithelial and endothelial cells. Preferably the cells are T lymphocytes
drawn from the patient to be treated, hepatocytes, any cell of the eye or respiratory or pulmonary epithelial cells.
Transfection of pulmonary epithelial cells can occur via inhalation of a nebulized preparation of DNA vectors in
liposomes, DNA-protein complexes or replication-deficient adenoviruses. See, e.g., U.S. Patent No. 5,240,846.

Another aspect of the invention is transgenic, non-human mammals capable of expressing the polynucleotides of
the invention or D87258 in any cell. Transgenic, non-human animals may be obtained by transfecting appropriate
fertilized eggs or embryos of a host with the polynucleotides of the invention, with D87258 or with mutant forms found
in human diseases. See, e.g., U.S. Patent Nos. 4,736,866; 5,175,385; 5,175,384 and 5,175,386. The resultant
transgenic animal may be used as a model for the study of PSP1 or D87258 gene function. Particularly useful transgenic
animals are those which display a detectable phenotype associated with the expression of the PSP1 or D87258 protein.
Drug development candidates may then be screened for their ability to reverse or exacerbate the relevant phenotype.

The present invention will now be described with reference to the following specific, non-limiting examples. 

Example 1 - Identification of the PS-1 Binding Partner PSP1 

A portion of PS-1 cDNA (GenBank Accession No. L42110) (SEQ ID NO: 9) encoding residues 269-413 of the
PS-1 amino acid sequence (SEQ ID NO: 10) was PCR amplified with the oligonucleotide primers
5'-CGGAATTCCGTATGCTGGTTGAAACA-3' (SEQ ID NO: 11) and 5'-CGGGATCCTCAGGCTACGAAACAGGCTAT-3' (SEQ ID NO: 12). The
product was digested with EcoRI and BamHI and cloned into pEG202 (Golemis et al., in Current Protocols in Molecular
Biology, John Wiley & Sons, New York (1994)). The resulting plasmid, pCC352, encoded a fusion protein in which the
DNA binding protein, LexA, was fused in-frame to amino acids 269-413 of PS-1. The parent vector, pEG202, was a
yeast expression vector which uses the alcohol dehydrogenase (ADH1) promoter to express the LexA fusion proteins
and HIS3 as the selectable marker. Sequence analysis using an automated DNA sequencer (Applied Biosystems, Inc.)
confirmed that the amplified region had the correct sequence and was fused in-frame to LexA.
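
The primer design can be checked with a short sketch. The primer strings are transcribed from the text above (the forward primer is reassembled across a line break in the source, so its exact sequence should be verified against SEQ ID NO: 11); the helper names are illustrative. The sketch locates the EcoRI and BamHI recognition sequences used for cloning and computes the length of the coding region for residues 269-413.

```python
FORWARD = "CGGAATTCCGTATGCTGGTTGAAACA"      # SEQ ID NO: 11 as transcribed
REVERSE = "CGGGATCCTCAGGCTACGAAACAGGCTAT"   # SEQ ID NO: 12 as transcribed

SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_sites(primer: str) -> dict:
    """Map each enzyme whose recognition sequence occurs in the primer
    to the 0-based index of its first occurrence."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def insert_coding_length(first_residue: int, last_residue: int) -> int:
    """Nucleotides needed to encode an inclusive residue range."""
    return (last_residue - first_residue + 1) * 3


print(find_sites(FORWARD))             # EcoRI site near the 5' end
print(find_sites(REVERSE))             # BamHI site near the 5' end
print(reverse_complement(REVERSE[8:11]))  # codon read on the coding strand
print(insert_coding_length(269, 413))  # 145 residues
```

As transcribed, the three bases following the BamHI site in the reverse primer read TGA on the coding strand, which would be consistent with a stop codon placed after residue 413, although the text does not state this explicitly.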

All procedures, plasmids and strains used in the two-hybrid screen have been described in detail by Golemis et
al., supra. Yeast strain EGY48 (MATa, trp1, his3, ura3, 6ops-LEU2) was cotransformed with the plasmids pCC352 and
pSH18-34. Transformants were selected using complete minimal media lacking uracil and histidine. The plasmid
pSH18-34 is a yeast expression vector in which eight LexA operator sites are located upstream of a minimal GAL1
promoter which drives the expression of the LacZ gene, and which carries URA3 as a selectable marker. Synthesis of
the full length LexA-PS-1 fusion was confirmed by Western blot analysis of yeast extracts using polyclonal antisera
directed against LexA. It was confirmed that the LexA-PS-1 fusion alone was unable to activate either the LEU2 or LacZ
reporters. In addition, the ability of the LexA-PS-1 fusion to enter the nucleus and bind DNA was confirmed using a
repression assay.

A strain containing the LexA-PS-1 fusion and pSH18-34 (CCY321) was transformed with a human fetal brain cDNA
library (Clontech) in plasmid pJG4-5 using a library scale transformation protocol. This library plasmid contains the
TRP1 selectable marker and allows the expression of cDNAs as fusions (AD fusions) to a cassette containing the SV40
nuclear localization sequence, the acid blob B42, and the hemagglutinin epitope tag. See Gyuris et al., Cell 75, 791-803
(1993). Expression of this fusion is under control of the galactose inducible promoter GAL1. Transformation reactions
were plated onto complete minimal media lacking uracil, histidine and tryptophan. Approximately 4.5 x 10^6 individual
transformants were obtained, pooled and frozen. To ensure that each primary colony was replated during the selection
procedure, 2 x 10^7 viable cells (approximately 3 times the number of individual transformants) were plated onto minimal
media lacking uracil, histidine, tryptophan and leucine with galactose/raffinose as the carbon source to induce
expression of AD fusions. Colonies arising after 3 and 4 days of growth at 30 °C were picked to complete minimal media
lacking uracil, histidine and tryptophan. Colonies containing potential interacting fusion proteins were then tested for
galactose dependence and LacZ expression. Those isolates which activated both the LEU2 and LacZ reporters in a
galactose dependent fashion were considered positive and pursued further. Plasmids were isolated from yeast and used
to transform E. coli strain KC8, and AD fusion plasmids were selected by growth on minimal E. coli media lacking
tryptophan. Each AD fusion plasmid containing a potential interacting fusion was used to transform CCY321. Several
transformants were subjected to screening for galactose dependent LEU2 and LacZ activation. To ensure that the
interaction was specific, the ability of each AD fusion plasmid to interact with 22 nonrelated LexA fusion proteins was
tested. AD fusion plasmids which passed this second round of screening and interacted specifically with the LexA-PS-1
fusion were identified.
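
The selection criteria above can be summarised as a small decision rule. The following Python sketch is illustrative only: the record layout, the field names and the use of glucose as the non-inducing comparison condition are assumptions made for the example and are not taken from the screen itself.

```python
from dataclasses import dataclass


@dataclass
class Isolate:
    """Hypothetical record of reporter read-outs for one yeast isolate."""
    name: str
    leu2_on_galactose: bool
    leu2_on_glucose: bool       # assumed non-inducing condition
    lacz_on_galactose: bool
    lacz_on_glucose: bool       # assumed non-inducing condition
    unrelated_baits_bound: int = 0  # hits among the 22 nonrelated LexA fusions


def galactose_dependent(on_galactose: bool, on_glucose: bool) -> bool:
    # A reporter counts as galactose dependent when it is active only
    # under inducing conditions.
    return on_galactose and not on_glucose


def is_positive(iso: Isolate) -> bool:
    # First round: both reporters must be activated in a galactose
    # dependent fashion.
    return (galactose_dependent(iso.leu2_on_galactose, iso.leu2_on_glucose)
            and galactose_dependent(iso.lacz_on_galactose, iso.lacz_on_glucose))


def is_specific(iso: Isolate) -> bool:
    # Second round: a positive isolate must not interact with any of the
    # nonrelated LexA fusion proteins.
    return is_positive(iso) and iso.unrelated_baits_bound == 0


candidates = [
    Isolate("clone_a", True, False, True, False, 0),
    Isolate("clone_b", True, True, True, False, 0),   # LEU2 not galactose dependent
    Isolate("clone_c", True, False, True, False, 3),  # binds unrelated baits
]
print([c.name for c in candidates if is_specific(c)])
```

Only an isolate that passes both rounds (here the hypothetical `clone_a`) would be carried forward for sequencing.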
Example 2 - PSP1 cDNA Cloning and Sequence Analysis

The AD fusion plasmids were subjected to restriction digest analysis and sequencing as indicated above. Sequence
analysis of one of the interacting fusion protein cDNAs revealed a 519 nucleotide open reading frame (SEQ ID NO: 1)
encoding a 173 amino acid (SEQ ID NO: 2) protein starting with a GGA at position 2 and terminating with a TGA at
position 523 of SEQ ID NO: 1. GenBank searches using the BLASTX and BLASTN algorithms with the cDNA sequence
or with the deduced amino acid sequence indicated homology to a portion of the E. coli serine protease htrA described
by Lipinska et al., supra, (SEQ ID NOs: 13 and 14). This novel cDNA was designated PSP1.
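
The coordinates quoted for this open reading frame are mutually consistent, which can be checked with simple arithmetic. The helper below is a hypothetical illustration (the function name is not from the original analysis); it assumes 1-based positions and a three-base stop codon immediately following the last sense codon.

```python
def orf_summary(start: int, n_residues: int) -> tuple[int, int, int]:
    """Return (ORF length in nt, last coding position, last stop-codon position).

    Positions are 1-based. The stop codon is assumed to follow the last
    sense codon directly.
    """
    orf_length = 3 * n_residues
    last_coding = start + orf_length - 1
    stop_end = last_coding + 3
    return orf_length, last_coding, stop_end


# 173 residues starting at position 2 of SEQ ID NO: 1
print(orf_summary(2, 173))  # (519, 520, 523)
```

The 519 nucleotide length and the stop codon ending at position 523 both follow from a 173 residue frame that begins at position 2.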

To obtain a greater portion of the cDNA, the oligonucleotide 5'-CTGGATGGGGAGGTGATTGGAGTG-3' (SEQ ID
NO: 15), representing bp 83-106 of SEQ ID NO: 1, was used to screen a Superscript human brain cDNA library (Gibco
BRL) using the Genetrapper cDNA positive selection system (Gibco BRL). Colonies were screened using whole cell
PCR or standard hybridization conditions as described by Innis et al., PCR Protocols: A Guide to Methods and
Applications, Academic Press, San Diego, CA (1990) and Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd
ed., Cold Spring Harbor Press, Cold Spring Harbor, NY (1989). Those isolates which contained PSP1 were subjected
to restriction digest analysis and sequencing. The longest clones, SEQ ID NO: 3 and SEQ ID NO: 5, were sequenced
in their entirety.
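
As an illustration of how the screening oligonucleotide relates to SEQ ID NO: 1, the sketch below extracts bp 83-106 from the first 120 nucleotides of that sequence as given in the sequence listing. The helper name `subsequence` is hypothetical; coordinates are treated as 1-based and inclusive, as in the text.

```python
# First 120 nt of SEQ ID NO: 1, copied from the sequence listing in blocks of ten.
SEQ1_PREFIX = "".join([
    "GGGACTCCCC", "CAAACCAATG", "TGGAATACAT", "TCAAACTGAT", "GCAGCTATTG", "ATTTTGGAAA",
    "CTCTGGAGGT", "CCCCTGGTTA", "ACCTGGATGG", "GGAGGTGATT", "GGAGTGAACA", "CCATGAAGGT",
])


def subsequence(seq: str, start: int, end: int) -> str:
    """Return bases start..end using 1-based, inclusive coordinates."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


probe = subsequence(SEQ1_PREFIX, 83, 106)
print(probe, len(probe))
```

The extracted 24-mer is identical to the oligonucleotide of SEQ ID NO: 15.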

Sequence analysis of SEQ ID NO: 3 revealed a 969 nucleotide open reading frame encoding a 323 amino acid
(SEQ ID NO: 4) protein starting with a CCC at position 1 and terminating with a TGA at position 972 of SEQ ID NO:
3. Sequence analysis of SEQ ID NO: 5 revealed a 1269 nucleotide open reading frame encoding a 423 amino acid
(SEQ ID NO: 6) protein starting with a CTT at position 1 and terminating with a TGA at position 1272 of SEQ ID NO: 5.

A second round of screening was performed using the oligonucleotide 5'-GTCTCTGGGCCCCGGTTGTCTGTTG-3'
(SEQ ID NO: 16), representing bp 5-28 of SEQ ID NO: 5; the library and screening protocol remained unchanged.
In the second round of screening, the isolate designated SEQ ID NO: 7 contained the longest cDNA clone. Sequence
analysis of SEQ ID NO: 7 revealed a 1374 nucleotide open reading frame encoding a 458 amino acid (SEQ ID NO: 8)
protein starting with an ATG at position 251 and terminating with a TGA at position 1627 of SEQ ID NO: 7. However,
SEQ ID NO: 7 does not have a stop codon upstream from the potential initiation codon. To confirm that the predicted
start codon is authentic, the 5' nucleotide sequence was extended with 5' RACE using "Marathon Ready" human brain
cDNA (Clontech) and a nested set of primers. A SEQ ID NO: 7 specific primer, 5'-CCAACAGACAACCGGGCCCAGAGACT-3'
(SEQ ID NO: 20), and 5' anchor primer-1 (Clontech) were used in the first PCR amplification, and a SEQ ID
NO: 7 specific primer, 5'-TGCCTCCTCGCCCGCCCTACTCAGA-3' (SEQ ID NO: 21), and 5' anchor primer-2 (Clontech)
were used in the second PCR amplification. PCR products were T/A cloned into pCR2.1 (Invitrogen). Eighteen isolates
with staggered 5' ends were analyzed and a 5' consensus sequence of 587 nucleotides was generated (SEQ ID NO:
22). Alignment of SEQ ID NO: 22 and SEQ ID NO: 7 to generate a consensus sequence (SEQ ID NO: 23) indicates
that at nucleotide position 225 there is an in-frame stop codon and that the first methionine corresponds to that
predicted in SEQ ID NO: 7. This gene is designated PSP1-2.

Consensus full length sequences for the genes designated PSP1-1 (SEQ ID NOs: 24 and 25), PSP1-3 (SEQ ID 
NOs: 26 and 27) and PSP1-4 (SEQ ID NOs: 28 and 29) were generated from alignment of the 5' consensus sequence 
(SEQ ID NO: 22), other partial PSP1 clones, and with SEQ ID NOs: 7, 3 and 5, respectively. 

Alignment of the deduced amino acid sequence of PSP1-1 (SEQ ID NO: 25) to E. coli htrA (SEQ ID NO: 14) was
accomplished using the BESTFIT algorithm (University of Wisconsin Genetics Computer Group). An approximate
similarity of 55% and an identity of 33.5% at the amino acid level were observed and are shown in Fig. 1 (top, PSP1-1;
bottom, E. coli htrA). The critical histidine and the serine motif GXSXG conserved in all serine proteases are present
in PSP1-1 at amino acid positions 198 and 304-308, respectively, and are indicated in bold. Amino acid numbers are
indicated at the left and right of the sequence alignment.
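
A motif such as GXSXG can be located with a simple pattern search. The sketch below is an illustration only: it uses a 16-residue fragment written in one-letter code and numbered 302-317, following the translation given with SEQ ID NO: 7 in the sequence listing, and it assumes that this numbering matches the PSP1-1 positions quoted above. The function name is hypothetical.

```python
import re

# Residues 302-317 (Asp Phe Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Glu Val)
FRAGMENT = "DFGNSGGPLVNLDGEV"
FRAGMENT_START = 302  # residue number of the first letter (assumed numbering)

# G-X-S-X-G, where X is any residue
MOTIF = re.compile(r"G.S.G")


def find_motif(fragment: str, first_residue: int) -> list[tuple[int, int, str]]:
    """Return (start, end, match) for each non-overlapping motif occurrence,
    using protein residue numbers rather than string offsets."""
    return [
        (first_residue + m.start(), first_residue + m.end() - 1, m.group())
        for m in MOTIF.finditer(fragment)
    ]


print(find_motif(FRAGMENT, FRAGMENT_START))  # [(304, 308, 'GNSGG')]
```

With this numbering the motif instance is GNSGG at residues 304-308, in agreement with the positions stated in the text.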

Nucleotide sequence comparison of PSP1-2, PSP1-1, PSP1-3 and PSP1-4 using the PILEUP and PRETTY algorithms
(University of Wisconsin Genetics Computer Group) with gap creation and extension penalties of 5.0 and 0.3,
respectively, is shown in Fig. 2. The alignment results indicate that at nucleotide position 1541 of the alignment, PSP1-2
and PSP1-1 contain a 225 bp deletion and PSP1-4 contains a 195 bp deletion. Within the same alignment at nucleotide
position 1942, PSP1-4 lacks 96 bp that are present in PSP1-2, PSP1-1 and PSP1-3. At the junction of each deletion
site there is a splice site consensus sequence AGG or TGG (indicated in bold), suggesting that these alternate forms
are due to alternative splicing. See Mount, S., Nucl. Acids Res. 10, 458-472 (1982). The apparent splicing event at
position 1541 results in the removal of a stop codon (underlined in Fig. 2) that is present in PSP1-3. In addition, PSP1-2
and PSP1-1 contain a single nucleotide difference at position 672 of the alignment. PSP1-2 contains a T at this position,
producing the codon TGC which codes for a cysteine, while PSP1-1 contains a C at the same position, producing the
codon CGC which codes for an arginine.
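
One quick check that can be applied to the deletion sizes reported above is whether each is a multiple of three, since such a deletion would leave the downstream reading frame unchanged. This check is offered as an illustration and is not part of the original analysis; the labels in the dictionary are descriptive only.

```python
# Deletion sizes (bp) reported for the alignment
deletions = {
    "PSP1-2 and PSP1-1 at position 1541": 225,
    "PSP1-4 at position 1541": 195,
    "PSP1-4 at position 1942": 96,
}


def frame_effect(length_bp: int) -> tuple[int, bool]:
    """Return (whole codons removed, True if the reading frame is preserved)."""
    codons, remainder = divmod(length_bp, 3)
    return codons, remainder == 0


for label, size in deletions.items():
    codons, in_frame = frame_effect(size)
    print(f"{label}: {size} bp = {codons} codons, in frame: {in_frame}")
```

All three sizes are whole numbers of codons (75, 65 and 32), which is compatible with the interpretation that the variants arise by alternative splicing.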

Nucleotide sequence comparison of PSP1-1 (SEQ ID NO: 24) to the putative human serine protease of Ohno et
al., supra, (SEQ ID NO: 17) indicated a 49% identity using the GAP algorithm and 65% using the BESTFIT algorithm
(data not shown). Alignment of the deduced amino acid sequence of PSP1-1 (SEQ ID NO: 25) to the D87258 protease
of Ohno et al., supra, (SEQ ID NO: 18) was accomplished using the BESTFIT algorithm and is shown in Fig. 3 (top,
PSP1-1; bottom, Ohno et al. D87258 protease). An approximate identity of 46% at the amino acid level was observed.
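
For readers unfamiliar with how a percent identity figure is obtained from an alignment, the following sketch counts identical positions in two pre-aligned sequences. It is not an implementation of GAP or BESTFIT, whose scoring and gap handling differ, and the two short aligned strings are hypothetical toy data rather than PSP1 or D87258 sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned, non-gap columns in which both sequences agree.

    Both inputs must already be aligned and therefore of equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # columns containing a gap are not scored here
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


# Toy aligned peptides (hypothetical): 5 of 8 columns are identical.
print(percent_identity("GNSGGPLV", "GDSGGALL"))  # 62.5
```

Different programs treat gaps and alignment length differently, which is one reason the same pair of sequences can yield different identity values under GAP and BESTFIT.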

Example 3 - Tissue Distribution of PSP1 

Northern analysis was carried out to determine the distribution of PSP1 mRNA in human tissues. A 30-base
oligonucleotide probe directed against the PSP1 sequence was used (5'-ATGCTGAACATCGGGAAAGCTTGGTTCTCG-3')
(SEQ ID NO: 19). This probe was 3'-end labelled with [32P]-dATP. Northern blots containing mRNA from multiple
human tissues (Clontech #7750-1, #7760-1, and #7755-1) were hybridized with this probe under stringent conditions.
A major band of approximately 1.9 kb was detected in all regions investigated: heart, brain, lung, placenta, liver, skeletal
muscle, kidney, pancreas, amygdala, caudate nucleus, corpus callosum, hippocampus, substantia nigra, subthalamic
nucleus, thalamus, cerebellum, cerebral cortex, medulla, spinal cord, occipital pole, frontal lobe, temporal pole, and
putamen. PSP1 mRNA was also detected in Alzheimer's disease brain.
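
A probe intended to detect mRNA must be complementary to the sense strand. The sketch below takes the reverse complement of the 30-base probe and looks for it in a short stretch of the PSP1 coding sequence copied from the SEQ ID NO: 7 listing. The fragment boundaries were chosen for the example and the function name is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


PROBE = "ATGCTGAACATCGGGAAAGCTTGGTTCTCG"  # SEQ ID NO: 19

# Sense-strand fragment of the coding sequence, copied from the SEQ ID NO: 7
# listing (CAG CTT CGA GAA CCA AGC TTT CCC GAT GTT CAG CAT GGT GTA).
CDS_FRAGMENT = "CAGCTTCGAGAACCAAGCTTTCCCGATGTTCAGCATGGTGTA"

target = reverse_complement(PROBE)
print(target)
print("probe is antisense to the fragment:", target in CDS_FRAGMENT)
```

The reverse complement of the probe occurs within the coding strand, so the labelled oligonucleotide is antisense and can hybridize to PSP1 mRNA.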

Example 4 - Detecting the PSP1 polymorphisms

PSP1 oligonucleotides 1AFC, 1AFT and 1AR were designed for detecting the polymorphism at nucleotide 672
(cytidine to thymine) causing the Arg to Cys amino acid change. The Allele Specific Oligonucleotides (ASO) 1AFC and
1AFT are identical apart from their 3' end bases and provide the specificity for screening for the polymorphism.

1AFC: CAT CCG GCA TTG TTA GCT CTG C    22mer (SEQ ID NO:32)
1AFT: CAT CCG GCA TTG TTA GCT CTG T    22mer (SEQ ID NO:33)
1AR:  CAA TAG CTG CAT CAG TTT GAA TG   23mer (SEQ ID NO:34)
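
The design principle stated above, that the two allele specific oligonucleotides differ only at their 3' end base, and the amino acid consequence of the C to T change can both be checked in a few lines. This is a sketch for illustration: the helper name is hypothetical, and the two-entry codon table covers only the codons discussed in this document (CGC and TGC, standard genetic code).

```python
PRIMERS = {
    "1AFC": "CATCCGGCATTGTTAGCTCTGC",  # SEQ ID NO: 32
    "1AFT": "CATCCGGCATTGTTAGCTCTGT",  # SEQ ID NO: 33
}


def differ_only_at_3prime(a: str, b: str) -> bool:
    """True if two primers are identical except for their final (3') base."""
    return len(a) == len(b) and a[:-1] == b[:-1] and a[-1] != b[-1]


# Minimal excerpt of the standard genetic code for the two codons involved.
CODON_TABLE = {"CGC": "Arg", "TGC": "Cys"}

print(differ_only_at_3prime(PRIMERS["1AFC"], PRIMERS["1AFT"]))  # True
print(PRIMERS["1AFC"][-1], PRIMERS["1AFT"][-1])                 # C T
print(CODON_TABLE["CGC"], "->", CODON_TABLE["TGC"])             # Arg -> Cys
```

Because extension by the polymerase depends strongly on a matched 3' terminal base, each primer is expected to amplify preferentially from the allele that matches its last nucleotide.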

Pairs of oligonucleotides (1AFC + 1AR, or 1AFT + 1AR) were used in a PCR under the following conditions: 94°C
for 40 seconds, 60°C for 30 seconds, for 35 cycles in a reaction containing 1 U KlenTaq1 (GenPak Ltd.), 50 mM
Tris-Cl pH 9.1, 16 mM ammonium sulphate, 3.5 mM MgCl2, 150 µg/ml BSA and 25 ng of human genomic DNA of unknown
source. Each pair of oligonucleotides was tested against 12 random samples of genomic DNA and the products were
electrophoresed on a 4% agarose (Gibco-BRL) gel. The expected product of 95 base pairs was seen for both ASOs in 8
of the 12 DNAs, indicating that these individuals are heterozygous for this polymorphism. Two of the DNAs amplified
with only the 1AFC oligonucleotide and are thus homozygous for the allele with the cytidine at this position. Two of the
DNAs amplified with only the 1AFT oligonucleotide and are thus homozygous for the allele with the thymine at this
position.
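
The genotype calls described above follow directly from which of the two allele specific reactions yields a product. The sketch below encodes that rule and tallies the 12 samples as reported (8 amplified with both primers, 2 with 1AFC only, 2 with 1AFT only). The function name and the ordering of the sample list are illustrative assumptions; the allele frequency is a derived figure not stated in the text.

```python
from collections import Counter


def call_genotype(amplified_with_c: bool, amplified_with_t: bool) -> str:
    """Call a genotype from the presence of product in each ASO reaction."""
    if amplified_with_c and amplified_with_t:
        return "C/T"
    if amplified_with_c:
        return "C/C"
    if amplified_with_t:
        return "T/T"
    return "no call"


# (product with 1AFC, product with 1AFT) for the 12 genomic DNA samples
results = [(True, True)] * 8 + [(True, False)] * 2 + [(False, True)] * 2

counts = Counter(call_genotype(c, t) for c, t in results)
c_alleles = 2 * counts["C/C"] + counts["C/T"]
total_alleles = 2 * (counts["C/C"] + counts["C/T"] + counts["T/T"])

print(dict(counts))               # {'C/T': 8, 'C/C': 2, 'T/T': 2}
print(c_alleles / total_alleles)  # 0.5
```

In this small sample the two alleles are equally represented, although 12 individuals are too few to estimate population frequencies reliably.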

PSP1 oligonucleotides 1BFC, 1BFT and 1BR were designed for detecting the polymorphism at nucleotide 1435
(cytidine to thymine) causing the Ala to Val amino acid change.

1BFC: TGG CGG GCT TTG GGG GGC ATT C    22mer (SEQ ID NO:35)
1BFT: TGG CGG GCT TTG GGG GGC ATT T    22mer (SEQ ID NO:36)
1BR:  GAC GTC AGC AGG GCC CGG AGG TC   23mer (SEQ ID NO:37)

Pairs of oligonucleotides (1BFC + 1BR, or 1BFT + 1BR) were used in a PCR under the following conditions: 94°C
for 40 seconds, 67°C for 30 seconds, for 35 cycles in a reaction containing 1 U KlenTaq1 (GenPak Ltd.), 50 mM
Tris-Cl pH 9.1, 16 mM ammonium sulphate, 3.5 mM MgCl2, 150 µg/ml BSA and 25 ng of human genomic DNA of unknown
source. Each pair of oligonucleotides was tested against 12 random samples of genomic DNA and the products were
electrophoresed on a 4% agarose (Gibco-BRL) gel. The expected product of 75 base pairs was seen using the 1BFT ASO
in 9 of the 12 samples, indicating that the other 3 individuals have a different allele at this position.

Example 5 - Detecting the D87258 polymorphism 

Oligonucleotides 2AFG, 2AFT and 2AR were designed for detecting the polymorphism at nucleotide 1325 (guanine
to thymine) causing the Gly to Val amino acid change.

2AFG: GAT ACC CCA GCA GAA GCT GG    20mer (SEQ ID NO:38)
2AFT: GAT ACC CCA GCA GAA GCT GT    20mer (SEQ ID NO:39)
2AR:  GCT GAC ATC ATT GGC GGA GAC   21mer (SEQ ID NO:40)

Pairs of oligonucleotides (2AFG + 2AR, or 2AFT + 2AR) were used in a PCR under the following conditions: 94°C
for 40 seconds, 62°C for 30 seconds, for 35 cycles in a reaction containing 1 U KlenTaq1 (GenPak Ltd.), 50 mM
Tris-Cl pH 9.1, 16 mM ammonium sulphate, 3.5 mM MgCl2, 150 µg/ml BSA and 25 ng of human genomic DNA of unknown
source. Each pair of oligonucleotides was tested against 12 random samples of genomic DNA and the products were
electrophoresed on a 4% agarose (Gibco-BRL) gel. The 2AFT ASO generated a band of approximately 1000 bp. The
predicted band was 90 bp. Presumably, the presence of the larger bands was due to the presence of an intron in the
region flanked by oligonucleotides 2AR and 2AFT. Bands were observed in all of the samples amplified with 2AFT,
indicating that the allele containing the thymine is present in all 12 individuals.
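
If the larger product is indeed explained by an intron between the two primers, its approximate size follows from the difference between the observed and predicted band sizes. The calculation below is a rough illustration; the observed size is an estimate read from an agarose gel, so the result is approximate, and the function name is hypothetical.

```python
def extra_sequence_bp(observed_bp: int, predicted_bp: int) -> int:
    """Length of sequence not accounted for by the cDNA-based prediction."""
    if observed_bp < predicted_bp:
        raise ValueError("observed product is shorter than predicted")
    return observed_bp - predicted_bp


# Observed band of approximately 1000 bp versus the predicted 90 bp product
print(extra_sequence_bp(1000, 90))  # 910
```

On these figures the presumed intron would be on the order of 900 bp.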

The present invention may be embodied in other specific forms without departing from the spirit or essential
attributes thereof, and, accordingly, reference should be made to the appended claims, rather than to the foregoing
specification, as indicating the scope of the invention.
SEQUENCE LISTING

(1) GENERAL INFORMATION 

(i) APPLICANT: Creasy, Caretha 
Livi, George 
Karran, Eric 
Clinkenbeard, Helen
Browne, Michael
Southan, Christopher 

(ii) TITLE OF THE INVENTION: HUMAN SERINE PROTEASE

(iii) NUMBER OF SEQUENCES: 40

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: SmithKline Beecham Corporation 

(B) STREET: 709 Swedeland Road 

(C) CITY: King of Prussia 
(D) STATE: PA

(E) COUNTRY: USA
(F) ZIP: 19406 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ Version 1.5 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 60/025436 

(B) FILING DATE: 06-SEPT-1996

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Baumeister, Kirk

(B) REGISTRATION NUMBER: 33,833 

(C) REFERENCE/DOCKET NUMBER: P50547P2 

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 610-270-5096 

(B) TELEFAX: 610-270-5090 

(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 732 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GGGACTCCCC CAAACCAATG TGGAATACAT TCAAACTGAT GCAGCTATTG ATTTTGGAAA 60
CTCTGGAGGT CCCCTGGTTA ACCTGGATGG GGAGGTGATT GGAGTGAACA CCATGAAGGT 120
CACAGCTGGA ATCTCCTTTG CCATCCCTTC TGATCGTCTT CGAGAGTTTC TGCATCGTGG 180
GGAAAAGAAG AATTCCTCCT CCGGAATCAG TGGGTCCCAG CGGCGCTACA TTGGGGTGAT 240
GATGCTGACC CTGAGTCCCA GCATCCTTGC TGAACTACAG CTTCGAGAAC CAAGCTTTCC 300
CGATGTTCAG CATGGTGTAC TCATCCATAA AGTCATCCTG GGCTCCCCTG CACACCGGGC 360
TGGTCTGCGG CCTGGTGATG TGATTTTGGC CATTGGGGAG CAGATGGTAC AAAATGCTGA 420
AGATGTTTAT GAAGCTGTTC GAACCCAATC CCAGTTGGCA GTGCAGATCC GGCGGGGACG 480
AGAAACACTG ACCTTATATG TGACCCCTGA GGTCACAGAA TGAATAGATC ACCAAGAGTA 540
TGAGGCTCCT GCTCTGATTT CCTCCTTGCC TTTCTGGCTG AGGTTCTGAG GGCACCGAGA 600
CAGAGGGTTA AATGAACCAG TGGGGGCAGG TCCCTCCAAC CACCAGCACT GACTCCTGGG 660
CTCTGAAGAA TCACAGAAAC ACTTTTTATA TAAAATAAAA TTATACCTAG CAACAAAAAA 720
AAAAAAAAAA AA 732



(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 173 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: N- terminal 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 



Gly Leu Pro Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile
1               5                   10                  15

Asp Phe Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Glu Val
            20                  25                  30

Ile Gly Val Asn Thr Met Lys Val Thr Ala Gly Ile Ser Phe Ala Ile
        35                  40                  45

Pro Ser Asp Arg Leu Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn
    50                  55                  60

Ser Ser Ser Gly Ile Ser Gly Ser Gln Arg Arg Tyr Ile Gly Val Met
65                  70                  75                  80

Met Leu Thr Leu Ser Pro Ser Ile Leu Ala Glu Leu Gln Leu Arg Glu
                85                  90                  95

Pro Ser Phe Pro Asp Val Gln His Gly Val Leu Ile His Lys Val Ile
            100                 105                 110

Leu Gly Ser Pro Ala His Arg Ala Gly Leu Arg Pro Gly Asp Val Ile
        115                 120                 125

Leu Ala Ile Gly Glu Gln Met Val Gln Asn Ala Glu Asp Val Tyr Glu
    130                 135                 140

Ala Val Arg Thr Gln Ser Gln Leu Ala Val Gln Ile Arg Arg Gly Arg
145                 150                 155                 160

Glu Thr Leu Thr Leu Tyr Val Thr Pro Glu Val Thr Glu
                165                 170

(2) INFORMATION FOR SEQ ID NO: 3: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1787 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

CCCAGTCTCT CCGCCCGGTT GTCTGTTGGG GTCACTGAAC CCCGAGCATG CCTGACGTCT 60

GGGACCCCGG GTCCCCCGGC ACAACTGACT GCGGTGACCC CAGATACCAG GACCCGGGAG 120 

GCCTCAGAGA ACTCTGGAAC CCGTTCGCGC GCGTGGCTGG CGGTGGCGCT GGGCGCTGGG 180 

GGGGCAGTGC TGTTGTTGTT GTGGGGCGGG GGTCGGGGTC CTCCGGCCGT CCTCGCCGCC 240 

GTCCCTAGCC CGCCGCCCGC TTCTCCCCGG AGTCAGTACA ACTTCATCGC AGATGTGGTG 300 

GAGAAGACAG CACCTGCCGT GGTCTATATC GAGATCCTGG ACCGGCACCC TTTCTTGGGC 360 

CCCGAGGTCC CTATCTCGAA CGGCTCAGGA TTCGTGGTGG CTGCCGATGG GCTCATTGTC 420

ACCAACGCCC ATGTGGTGGC TGATCGGCGC AGAGTCCGTG TGAGACTGCT AAGCGGCGAC 480 

ACGTATGAGG CCGTGGTCAC AGCTGTGGAT CCCGTGGCAG ACATCGCAAC GCTGAGGATT 540 

CAGACTAAGG AGCCTCTCCC CACGCTGCCT CTGGGACGCT CAGCTGATGT CCGGCAAGGG 600 

GAGTTTGTTG TTGCCATGGG AAGTCCCTTT GCACTGCAGA ACACGATCAC ATCCGGCATT 660 

GTTAGCTCTG CTCAGCGTCC AGCCAGAGAC CTGGGACTCC CCCAAACCAA TGTGGAATAC 720 

ATTCAAACTG ATGCAGCTAT TGATTTTGGA AACTCTGGAG GTCCCCTGGT TAACCTGGTG 780

AGTGAGACAT CCTTCCTTCC AAGAATCCCT GCCCCAGGTC AGTGTGGGAA GGGTAGGTTT 840 

CCCCTAATTC AAGGATGTTT GGTCAAGTTT CTGAGCAGTT CTTTGTTGGC TATCTCTCAA 900 

TATCCAACCA GATCTCCCCA ACACTTGCTG GTACTTTTGT TCGGGTGCCC CCATCCCCTA 960 

CTATTTGTTT AGGCTAGGGA ACTGGGGGCT GTATCCCTGC AGGATGGGGA GGTGATTGGA 1020 

GTGAACACCA TGAAGGTCAC AGCTGGAATC TCCTTTGCCA TCCCTTCTGA TCGTCTTCGA 1080 

GAGTTTCTGC ATCGTGGGGA AAAGAAGAAT TCCTCCTCCG GAATCAGTGG GTCCCAGCGG 1140

CGCTACATTG GGGTGATGAT GCTGACCCTG AGTCCCAGCA TCCTTGCTGA ACTACAGCTT 1200 

CGAGAACCAA GCTTTCCCGA TGTTCAGCAT GGTGTACTCA TCCATAAAGT CATCCTGGGC 1260 

TCCCCTGCAC ACCGGGCTGG TCTGCGGCCT GGTGATGTGA TTTTGGCCAT TGGGGAGCAG 1320 

ATGGTACAAA ATGCTGAAGA TGTTTATGAA GCTGTTCGAA CCCAATCCCA GTTGGCAGTG 1380 

CAGATCCGGC GGGGACGAGA AACACTGACC TTATATGTGA CCCCTGAGGT CACAGAATGA 1440 

ATAGATCACC AAGAGTATGA GGCTCCTGCT CTGATTTCCT CCTTGCCTTT CTGGCTGAGG 1500

TTCTGAGGGC ACCGAGACAG AGGGTTAAAT GAACCAGTGG GGGCAGGTCC CTCCAACCAC 1560 

CAGCACTGAC TCCTGGGCTC TGAAGAATCA CAGAAACACT TTTTATATAA AATAAAATTA 1620 

TACCTAGCAA CATATTATAG TAAAAAATGA GGTGGGAGGG CTGGATCTTT TCCCCCACCA 1680 

AAAGGCTAGA GGTAAAGCTG TATCCCCCTA AACTTAGGGG AGATACTGGA GCTGACCATC 1740 

CTGACCTCCT ATTAAAGAAA ATGAGCTGCT GAAAAAAAAA AAAAAAA 1787 


(2) INFORMATION FOR SEQ ID NO: 4: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 323 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: N-terminal
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 



Pro Ser Leu Trp Ala Arg Leu Ser Val Gly Val Thr Glu Pro Arg Ala
1               5                   10                  15

Cys Leu Thr Ser Gly Thr Pro Gly Pro Arg Ala Gln Leu Thr Ala Val
            20                  25                  30

Thr Pro Asp Thr Arg Thr Arg Glu Ala Ser Glu Asn Ser Gly Thr Arg
        35                  40                  45

Ser Arg Ala Trp Leu Ala Val Ala Leu Gly Ala Gly Gly Ala Val Leu
    50                  55                  60

Leu Leu Leu Trp Gly Gly Gly Arg Gly Pro Pro Ala Val Leu Ala Ala
65                  70                  75                  80

Val Pro Ser Pro Pro Pro Ala Ser Pro Arg Ser Gln Tyr Asn Phe Ile
                85                  90                  95

Ala Asp Val Val Glu Lys Thr Ala Pro Ala Val Val Tyr Ile Glu Ile
            100                 105                 110

Leu Asp Arg His Pro Phe Leu Gly Arg Glu Val Pro Ile Ser Asn Gly
        115                 120                 125

Ser Gly Phe Val Val Ala Ala Asp Gly Leu Ile Val Thr Asn Ala His
    130                 135                 140

Val Val Ala Asp Arg Arg Arg Val Arg Val Arg Leu Leu Ser Gly Asp
145                 150                 155                 160

Thr Tyr Glu Ala Val Val Thr Ala Val Asp Pro Val Ala Asp Ile Ala
                165                 170                 175

Thr Leu Arg Ile Gln Thr Lys Glu Pro Leu Pro Thr Leu Pro Leu Gly
            180                 185                 190

Arg Ser Ala Asp Val Arg Gln Gly Glu Phe Val Val Ala Met Gly Ser
        195                 200                 205

Pro Phe Ala Leu Gln Asn Thr Ile Thr Ser Gly Ile Val Ser Ser Ala
    210                 215                 220

Gln Arg Pro Ala Arg Asp Leu Gly Leu Pro Gln Thr Asn Val Glu Tyr
225                 230                 235                 240

Ile Gln Thr Asp Ala Ala Ile Asp Phe Gly Asn Ser Gly Gly Pro Leu
                245                 250                 255

Val Asn Leu Val Ser Glu Thr Ser Phe Leu Pro Arg Ile Pro Ala Pro
            260                 265                 270

Gly Gln Cys Gly Lys Gly Arg Phe Pro Leu Ile Gln Gly Cys Leu Val
        275                 280                 285

Lys Phe Leu Ser Ser Ser Leu Leu Ala Ile Ser Gln Tyr Pro Thr Arg
    290                 295                 300

Ser Pro Gln His Leu Leu Val Leu Leu Phe Gly Cys Pro His Pro Leu
305                 310                 315                 320

Leu Phe Val



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1503 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

CTTCGGGCAT GGCGGGCTTT GGGGGGCATT CGCTGGGGGA GGAGACCCCG TTTGACCCCT 60 

GACCTCCGGG CCCTGCTGAC GTCAGGAACT TCTGACCCCC GGGCCCGAGT GACTTATGGG 120 

ACCCCCAGTC TCTGGGCCCG GTTGTCTGTT GGGGTCACTG AACCCCGAGC ATGCCTGACG 180 

TCTGGGACCC CGGGTCCCCG GGCACAACTG ACTGCGGTGA CCCCAGATAC CAGGACCCGG 240 

GAGGCCTCAG AGAACTCTGG AACCCGTTCG CGCGCGTGGC TGGCGGTGCC GCTGGGCGCT 300 

GGGGGGGCAG TGCTGTTGTT GTTGTGGGGC GGGGGTCGGG GTCCTCCGGC CGTCCTCGCC 360 

GCCGTCCCTA GCCCGCCGCC CGCTTCTCCC CGGAGTCAGT ACAACTTCAT CGCAGATGTG 420 

GTGGAGAAGA CAGCACCTGC CGTGGTCTAT ATCGAGATCC TGGACCGGCA CCCTTTCTTG 480 

GGCCGCGAGG TCCCTATCTC GAACGGCTCA GGATTCGTGG TGGCTGCCGA TGGGCTCATT 540 

GTCACCAACG CCCATGTGGT GGCTGATCGG CGCAGAGTCC GTGTGAGACT GCTAAGCGGC 600 

GACACGTATG AGGCCGTGGT CACAGCTGTG GATCCCGTGG CAGACATCGC AACGCTGAGG 660 

ATTCAGACTA AGGAGCCTCT CCCCACGCTG CCTCTGGGAC GCTCAGCTGA TGTCCGGCAA 720 

GGGGAGTTTG TTGTTGCCAT GGGAAGTCCC TTTGCACTGC AGAACACGAT CACATCCGGC 780 

ATTGTTAGCT CTGCTCAGCG TCCAGCCAGA GACCTGGGAC TCCCCCAAAC CAATGTGGAA 840 

TACATTCAAA CTGATGCAGC TATTGATTTT GGAAACTCTG GAGGTCCCCT GGTTAACCTG 900 

GCTAGGGAAC TGGGGGCTGT ATCCCTGCAG GATGGGGAGG TGATTGGAGT GAACACCATG 960 

AAGGTCACAG CTGGAATCTC CTTTGCCATC CCTTCTGATC GTCTTCGAGA GTTTCTGCAT 1020 

CGTGGGGAAA AGAAGAATTC CTCCTCCGGA ATCAGTGGGT CCCAGCGGCG CTACATTGGG 1080 

GTGATGATGC TGACCCTGAG TCCCAGGGCT GGTCTGCGGC CTGGTGATGT GATTTTGGCC 1140 

ATTGGGGAGC AGATGGTACA AAATGCTGAA GATGTTTATG AAGCTGTTCG AACCCAATCC 1200 

CAGTTGGCAG TGCAGATCCG GCGGGGACGA GAAACACTGA CCTTATATGT GACCCCTGAG 1260 
GTCACAGAAT GAATAGATCA CCAAGAGTAT GAGGCTCCTG CTCTGATTTC CTCCTTGCCT 1320

TTCTGGCTGA GGTTCTGAGG GCACCGAGAC AGAGGGTTAA ATGAACCAGT GGGGGCAGGT 1380 

CCCTCCAACC ACCAGCACTG ACTCCTGGGC TCTGAAGAAT CACAGAAACA CTTTTTATAT 1440 

AAAATAAAAT TATACCTAGC AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1500

AAA 1503

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 423 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 
(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: N- terminal 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Leu Arg Ala Trp Arg Ala Leu Gly Gly Ile Arg Trp Gly Arg Arg Pro
1               5                   10                  15

Arg Leu Thr Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp
            20                  25                  30

Pro Arg Ala Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu
        35                  40                  45

Ser Val Gly Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro
    50                  55                  60

Gly Pro Arg Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg
65                  70                  75                  80

Glu Ala Ser Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val
                85                  90                  95

Ala Leu Gly Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly
            100                 105                 110

Arg Gly Pro Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala
        115                 120                 125

Ser Pro Arg Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr
    130                 135                 140

Ala Pro Ala Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu
145                 150                 155                 160

Gly Arg Glu Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala
                165                 170                 175

Asp Gly Leu Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg
            180                 185                 190

Val Arg Val Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr
        195                 200                 205

Ala Val Asp Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys
    210                 215                 220

Glu Pro Leu Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln
225                 230                 235                 240

Gly Glu Phe Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr
                245                 250                 255

Ile Thr Ser Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu
            260                 265                 270

Gly Leu Pro Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile
        275                 280                 285

Asp Phe Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Ala Arg Glu Leu
    290                 295                 300

Gly Ala Val Ser Leu Gln Asp Gly Glu Val Ile Gly Val Asn Thr Met
305                 310                 315                 320

Lys Val Thr Ala Gly Ile Ser Phe Ala Ile Pro Ser Asp Arg Leu Arg
                325                 330                 335

Glu Phe Leu His Arg Gly Glu Lys Lys Asn Ser Ser Ser Gly Ile Ser
            340                 345                 350

Gly Ser Gln Arg Arg Tyr Ile Gly Val Met Met Leu Thr Leu Ser Pro
        355                 360                 365

Arg Ala Gly Leu Arg Pro Gly Asp Val Ile Leu Ala Ile Gly Glu Gln
    370                 375                 380

Met Val Gln Asn Ala Glu Asp Val Tyr Glu Ala Val Arg Thr Gln Ser
385                 390                 395                 400

Gln Leu Ala Val Gln Ile Arg Arg Gly Arg Glu Thr Leu Thr Leu Tyr
                405                 410                 415

Val Thr Pro Glu Val Thr Glu
            420

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1835 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 251...1624
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GGCCGGAAGG GCTAGCGGTC CCAGCATACC CCGCGGCCCC TTGGGCCGTC TCACAACTCG 60
CGTCCGGCGG AGACCACAAT TCCCGGCATT CGTGGGGCAT GGAGGAGTCG GCCTCCCGGA 120
ATCCTGGTCC CGGCGTGCAC TTCTGAAGGA CTTCAGGTAC CGGCGTGCCC CGCGTCCTAC 180
TGTCCGCCTG CTCGCGTCCT GGGTGCCGCC TCTGAGTAGG GCGGGCGAGG AGGCAGCCAA 240
GGCGGAGCTG ATG GCT GCG CCG AGG GCG GGG CGG GGT GCA GGC TGG AGC 289
           Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser
           1               5                   10

CTT CGG GCA TGG CGG GCT TTG GGG GGC ATT TGC TGG GGG AGG AGA CCC 337
Leu Arg Ala Trp Arg Ala Leu Gly Gly Ile Cys Trp Gly Arg Arg Pro
    15                  20                  25

CGT TTG ACC CCT GAC CTC CGG GCC CTG CTG ACG TCA GGA ACT TCT GAC 385
Arg Leu Thr Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp
30                  35                  40                  45

CCC CGG GCC CGA GTG ACT TAT GGG ACC CCC AGT CTC TGG GCC CGG TTG 433
Pro Arg Ala Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu
                50                  55                  60

TCT GTT GGG GTC ACT GAA CCC CGA GCA TGC CTG ACG TCT GGG ACC CCG 481
Ser Val Gly Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro
            65                  70                  75

GGT CCC CGG GCA CAA CTG ACT GCG GTG ACC CCA GAT ACC AGG ACC CGG 529
Gly Pro Arg Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg
        80                  85                  90

GAG GCC TCA GAG AAC TCT GGA ACC CGT TCG CGC GCG TGG CTG GCG GTG 577
Glu Ala Ser Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val
    95                  100                 105

GCG CTG GGC GCT GGG GGG GCA GTG CTG TTG TTG TTG TGG GGC GGG GGT 625
Ala Leu Gly Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly
110                 115                 120                 125

CGG GGT CCT CCG GCC GTC CTC GCC GCC GTC CCT AGC CCG CCG CCC GCT 673
Arg Gly Pro Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala
                130                 135                 140

TCT CCC CGG AGT CAG TAC AAC TTC ATC GCA GAT GTG GTG GAG AAG ACA 721
Ser Pro Arg Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr
            145                 150                 155

GCA CCT GCC GTG GTC TAT ATC GAG ATC CTG GAC CGG CAC CCT TTC TTG 769
Ala Pro Ala Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu
        160                 165                 170

GGC CGC GAG GTC CCT ATC TCG AAC GGC TCA GGA TTC GTG GTG GCT GCC 817
Gly Arg Glu Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala
    175                 180                 185

GAT GGG CTC ATT GTC ACC AAC GCC CAT GTG GTG GCT GAT CGG CGC AGA 865
Asp Gly Leu Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg
190                 195                 200                 205

GTC CGT GTG AGA CTG CTA AGC GGC GAC ACG TAT GAG GCC GTG GTC ACA 913
Val Arg Val Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr
                210                 215                 220

GCT GTG GAT CCC GTG GCA GAC ATC GCA ACG CTG AGG ATT CAG ACT AAG 961
Ala Val Asp Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys
            225                 230                 235

GAG CCT CTC CCC ACG CTG CCT CTG GGA CGC TCA GCT GAT GTC CGG CAA 1009
Glu Pro Leu Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln
        240                 245                 250

GGG GAG TTT GTT GTT GCC ATG GGA AGT CCC TTT GCA CTG CAG AAC ACG 1057
Gly Glu Phe Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr
    255                 260                 265

ATC ACA TCC GGC ATT GTT AGC TCT GCT CAG CGT CCA GCC AGA GAC CTG 1105
Ile Thr Ser Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu
270                 275                 280                 285

GGA CTC CCC CAA ACC AAT GTG GAA TAC ATT CAA ACT GAT GCA GCT ATT 1153
Gly Leu Pro Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile
                290                 295                 300

GAT TTT GGA AAC TCT GGA GGT CCC CTG GTT AAC CTG GAT GGG GAG GTG 1201
Asp Phe Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Glu Val
            305                 310                 315

ATT GGA GTG AAC ACC ATG AAG GTC ACA GCT GGA ATC TCC TTT GCC ATC 1249
Ile Gly Val Asn Thr Met Lys Val Thr Ala Gly Ile Ser Phe Ala Ile
        320                 325                 330

CCT TCT GAT CGT CTT CGA GAG TTT CTG CAT CGT GGG GAA AAG AAG AAT 1297
Pro Ser Asp Arg Leu Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn
    335                 340                 345

TCC TCC TCC GGA ATC AGT GGG TCC CAG CGG CGC TAC ATT GGG GTG ATG 1345
Ser Ser Ser Gly Ile Ser Gly Ser Gln Arg Arg Tyr Ile Gly Val Met
350                 355                 360                 365

ATG CTG ACC CTG AGT CCC AGC ATC CTT GCT GAA CTA CAG CTT CGA GAA 1393
Met Leu Thr Leu Ser Pro Ser Ile Leu Ala Glu Leu Gln Leu Arg Glu
                370                 375                 380

CCA AGC TTT CCC GAT GTT CAG CAT GGT GTA CTC ATC CAT AAA GTC ATC 1441
Pro Ser Phe Pro Asp Val Gln His Gly Val Leu Ile His Lys Val Ile
            385                 390                 395

CTG GGC TCC CCT GCA CAC CGG GCT GGT CTG CGG CCT GGT GAT GTG ATT 1489
Leu Gly Ser Pro Ala His Arg Ala Gly Leu Arg Pro Gly Asp Val Ile
        400                 405                 410

TTG GCC ATT GGG GAG CAG ATG GTA CAA AAT GCT GAA GAT GTT TAT GAA 1537
Leu Ala Ile Gly Glu Gln Met Val Gln Asn Ala Glu Asp Val Tyr Glu
    415                 420                 425

GCT GTT CGA ACC CAA TCC CAG TTG GCA GTG CAG ATC CGG CGG GGA CGA 1585
Ala Val Arg Thr Gln Ser Gln Leu Ala Val Gln Ile Arg Arg Gly Arg
430                 435                 440                 445

GAA ACA CTG ACC TTA TAT GTG ACC CCT GAG GTC ACA GAA TGAATAGATC ACC 1637
Glu Thr Leu Thr Leu Tyr Val Thr Pro Glu Val Thr Glu
                450                 455

AAGAGTATGA GGCTCCTGCT CTGATTTCCT CCTTGCCTTT CTGGCTGAGG TTCTGAGGGC 1697

ACCGAGACAG AGGGTTAAAT GAACCAGTGG GGGCAGGTCC CTCCAACCAC CAGCACTGAC 1757

TCCTGGGCTC TGAAGAATCA CAGAAACACT TTTTATATAA AATAAAATTA TACCTAGCAA 1817

CATAAAAAAA AAAAAAAA 1835

(2) INFORMATION FOR SEQ ID NO: 8: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 458 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 





Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg Ala
1 5 10 15

Trp Arg Ala Leu Gly Gly Ile Cys Trp Gly Arg Arg Pro Arg Leu Thr
20 25 30

Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg Ala
35 40 45

Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val Gly
50 55 60

Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro Arg
65 70 75 80

Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala Ser
85 90 95

Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu Gly
100 105 110

Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly Pro
115 120 125

Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro Arg
130 135 140

Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro Ala
145 150 155 160

Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg Glu
165 170 175

Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly Leu
180 185 190

Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg Val
195 200 205

Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val Asp
210 215 220

Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro Leu
225 230 235 240

Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu Phe
245 250 255

Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr Ser
260 265 270

Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu Gly Leu Pro
275 280 285

Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile Asp Phe Gly
290 295 300

Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Glu Val Ile Gly Val
305 310 315 320

Asn Thr Met Lys Val Thr Ala Gly Ile Ser Phe Ala Ile Pro Ser Asp
325 330 335

Arg Leu Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn Ser Ser Ser
340 345 350

Gly Ile Ser Gly Ser Gln Arg Arg Tyr Ile Gly Val Met Met Leu Thr
355 360 365

Leu Ser Pro Ser Ile Leu Ala Glu Leu Gln Leu Arg Glu Pro Ser Phe
370 375 380

Pro Asp Val Gln His Gly Val Leu Ile His Lys Val Ile Leu Gly Ser
385 390 395 400

Pro Ala His Arg Ala Gly Leu Arg Pro Gly Asp Val Ile Leu Ala Ile
405 410 415

Gly Glu Gln Met Val Gln Asn Ala Glu Asp Val Tyr Glu Ala Val Arg
420 425 430

Thr Gln Ser Gln Leu Ala Val Gln Ile Arg Arg Gly Arg Glu Thr Leu
435 440 445

Thr Leu Tyr Val Thr Pro Glu Val Thr Glu
450 455



(2) INFORMATION FOR SEQ ID NO: 9: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2764 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

TGGGACAGGC AGCTCCGGGG TCCGCGGTTT CACATCGGAA ACAAAACAGC GGCTGGTCTG 60

GAAGGAACCT GAGCTACGAG CCGCGGCGGC AGCGGGGCGG CGGGGAAGCG TATACCTAAT 120

CTGGGAGCCT GCAAGTGACA ACAGCCTTTG CGGTCCTTAG ACAGCTTGGC CTGGAGGAGA 180

ACACATGAAA GAAAGAACCT CAAGAGGCTT TGTTTTCTGT GAAACAGTAT TTCTATACAG 240

TTGCTCCAAT GACAGAGTTA CCTGCACCGT TGTCCTACTT CCAGAATGCA CAGATGTCTG 300

AGGACAACCA CCTGAGCAAT ACTGTACGTA GCCAGAATGA CAATAGAGAA CGGCAGGAGC 360

ACAACGACAG ACGGAGCCTT GGCCACCCTG AGCCATTATC TAATGGACGA CCCCAGGGTA 420

ACTCCCGGCA GGTGGTGGAG CAAGATGAGG AAGAAGATGA GGAGCTGACA TTGAAATATG 480

GCGCCAAGCA TGTGATCATG CTCTTTGTCC CTGTGACTCT CTGCATGGTG GTGGTCGTGG 540

CTACCATTAA GTCAGTCAGC TTTTATACCC GGAAGGATGG GCAGCTAATC TATACCCCAT 600

TCACAGAAGA TACCGAGACT GTGGGCCAGA GAGCCCTGCA CTCAATTCTG AATGCTGCCA 660

TCATGATCAG TGTCATTGTT GTCATGACTA TCCTCCTGGT GGTTCTGTAT AAATACAGGT 720

GCTATAAGGT CATCCATGCC TGGCTTATTA TATCATCTCT ATTGTTGCTG TTCTTTTTTT 780

CATTCATTTA CTTGGGGGAA GTGTTTAAAA CCTATAACGT TGCTGTGGAC TACATTACTG 840

TTGCACTCCT GATCTGGAAT TTTGGTGTGG TGGGAATGAT TTCCATTCAC TGGAAAGGTC 900

CACTTCGACT CCAGCAGGCA TATCTCATTA TGATTAGTGC CCTCATGGCC CTGGTGTTTA 960

TCAAGTACCT CCCTGAATGG ACTGCGTGGC TCATCTTGGC TGTGATTTCA GTATATGATT 1020

TAGTGGCTGT TTTGTGTCCG AAAGGTCCAC TTCGTATGCT GGTTGAAACA GCTCAGGAGA 1080

GAAATGAAAC GCTTTTTCCA GCTCTCATTT ACTCCTCAAC AATGGTGTGG TTGGTGAATA 1140

TGGCAGAAGG AGACCCGGAA GCTCAAAGGA GAGTATCCAA AAATTCCAAG TATAATGCAG 1200

AAAGCACAGA AAGGGAGTCA CAAGACACTG TTGCAGAGAA TGATGATGGC GGGTTCAGTG 1260

AGGAATGGGA AGCCCAGAGG GACAGTCATC TAGGGCCTCA TCGCTCTACA CCTGAGTCAC 1320

GAGCTGCTGT CCAGGAACTT TCCAGCAGTA TCCTCGCTGG TGAAGACCCA GAGGAAAGGG 1380

GAGTAAAACT TGGATTGGGA GATTTCATTT TCTACAGTGT TCTGGTTGGT AAAGCCTCAG 1440

CAACAGCCAG TGGAGACTGG AACACAACCA TAGCCTGTTT CGTAGCCATA TTAATTGGTT 1500

TGTGCCTTAC ATTATTACTC CTTGCCATTT TCAAGAAAGC ATTGCCAGCT CTTCCAATCT 1560

CCATCACCTT TGGGCTTGTT TTCTACTTTG CCACAGATTA TCTTGTACAG CCTTTTATGG 1620

ACCAATTAGC ATTCCATCAA TTTTATATCT AGCATATTTG CGGTTAGAAT CCCATGGATG 1680

TTTCTTCTTT GACTATAACC AAATCTGGGG AGGACAAAGG TGATTTTCCT GTGTCCACAT 1740

CTAACAAAGT CAAGATTCCC GGCTGGACTT TTGCAGCTTC CTTCCAAGTC TTCCTGACCA 1800

CCTTGCACTA TTGGACTTTG GAAGGAGGTG CCTATAGAAA ACGATTTTGA ACATACTTCA 1860

TCGCAGTGGA CTGTGTCCCT CGGTGCAGAA ACTACCAGAT TTGAGGGACG AGGTCAAGGA 1920

GATATGATAG GCCCGGAAGT TGCTGTGCCC CATCAGCAGC TTGACGCGTG GTCACAGGAC 1980

GATTTCACTG ACACTGCGAA CTCTCAGGAC TACCGGTTAC CAAGAGGTTA GGTGAAGTGG 2040

TTTAAACCAA ACGGAACTCT TCATCTTAAA CTACACGTTG AAAATCAACC CAATAATTCT 2100

GTATTAACTG AATTCTGAAC TTTTCAGGAG GTACTGTGAG GAAGAGCAGG CACCAGCAGC 2160

AGAATGGGGA ATGGAGAGGT GGGCAGGGGT TCCAGCTTCC CTTTGATTTT TTGCTGCAGA 2220

CTCATCCTTT TTAAATGAGA CTTGTTTTCC CCTCTCTTTG AGTCAAGTCA AATATGTAGA 2280

TTGCCTTTGG CAATTCTTCT TCTCAAGCAC TGACACTCAT TACCGTCTGT GATTGCCATT 2340

TCTTCCCAAG GCCAGTCTGA ACCTGAGGTT GCTTTATCCT AAAAGTTTTA ACCTCAGGTT 2400

CCAAATTCAG TAAATTTTGG AAACAGTACA GCTATTTCTC ATCAATTCTC TATCATGTTG 2460

AAGTCAAATT TGGATTTTCC ACCAAATTCT GAATTTGTAG ACATACTTGT ACGCTCACTT 2520

GCCCCCAGAT GCCTCCTCTG TCCTCATTCT TCTCTCCCAC ACAAGCAGTC TTTTTCTACA 2580

GCCAGTAAGG CAGCTCTGTC RTGGTAGCAG ATGGTCCCAT TATTCTAGGG TCTTACTCTT 2640

TGTATGATGA AAAGAATGTG TTATGAATCG GTGCTGTCAG CCCTGCTGTC AGACCTTCTT 2700

CCACAGCAAA TGAGATGTAT GCCCAAAGCG GTAGAATTAA AGAAGAGTAA AATGGCTGTT 2760

GAAG 2764

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 467 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: N-terminal

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Thr Glu Leu Pro Ala Pro Leu Ser Tyr Phe Gln Asn Ala Gln Met
1 5 10 15

Ser Glu Asp Asn His Leu Ser Asn Thr Val Arg Ser Gln Asn Asp Asn
20 25 30

Arg Glu Arg Gln Glu His Asn Asp Arg Arg Ser Leu Gly His Pro Glu
35 40 45

Pro Leu Ser Asn Gly Arg Pro Gln Gly Asn Ser Arg Gln Val Val Glu
50 55 60

Gln Asp Glu Glu Glu Asp Glu Glu Leu Thr Leu Lys Tyr Gly Ala Lys
65 70 75 80

His Val Ile Met Leu Phe Val Pro Val Thr Leu Cys Met Val Val Val
85 90 95



